Methodology

How Civilization's First Exam scores model governance under uncertainty.

Scoring Core

Regen Score = w_objective * Objective Total + w_decision * Decision Integrity

Objective Total reflects survival, stability, equity, adaptability, and welfare. Decision Integrity reflects counterfactual quality, judge consistency, and anti-gaming reliability.
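As a minimal sketch of the weighted sum above: the weight values and the 0 to 100 scale used here are assumptions for illustration, since the published weights are not stated in this section. The names `ScoreWeights` and `regen_score` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreWeights:
    """Hypothetical container for w_objective and w_decision."""
    objective: float
    decision: float


def regen_score(objective_total: float,
                decision_integrity: float,
                weights: ScoreWeights) -> float:
    """Regen Score = w_objective * Objective Total + w_decision * Decision Integrity."""
    return (weights.objective * objective_total
            + weights.decision * decision_integrity)


# Assumed example values, not taken from the benchmark.
example_weights = ScoreWeights(objective=0.5, decision=0.5)
example_score = regen_score(80.0, 60.0, example_weights)
```

With equal weights the score is the plain average of the two components; shifting weight toward `decision` rewards how choices were made more than the outcomes they produced.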

Deterministic replay checksums
Versioned scenario + scoring contracts
Audit log snapshots per run
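One way a deterministic replay checksum could work is to hash a canonical serialization of the run trace together with the scenario and scoring contract versions. The sketch below assumes JSON-serializable traces and SHA-256; the function name, field names, and version strings are hypothetical, not the benchmark's actual format.

```python
import hashlib
import json


def replay_checksum(trace: list[dict],
                    scenario_version: str,
                    scoring_version: str) -> str:
    """Return a SHA-256 hex digest over a canonical encoding of one run."""
    payload = {
        "scenario_version": scenario_version,
        "scoring_version": scoring_version,
        "trace": trace,
    }
    # sort_keys and fixed separators make the encoding independent of
    # dict insertion order and whitespace, so equal runs hash equally.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Assumed example trace with two ticks.
example_trace = [
    {"tick": 0, "action": "ration"},
    {"tick": 1, "action": "consume"},
]
example_digest = replay_checksum(example_trace, "scenario-v1", "scoring-v1")
```

Two replays of the same run should produce the same digest; a change to the trace or to either contract version produces a different one, which is what makes the digest usable in an audit log.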

Evaluation Pipeline

Step 01

Scenario Execution

Run each model adapter across the stress scenarios with fixed seeds, capturing a trace at every tick.
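The sketch below illustrates why fixed seeds matter: with a private seeded random generator, the same adapter and seed reproduce the same per-tick trace. The toy world (a single food stock hit by random shocks), the adapter signature, and the action names are all assumptions made for illustration; the real scenarios and adapter interface are not described here.

```python
import random
from typing import Callable

State = dict[str, float]


def run_scenario(adapter: Callable[[State], str],
                 seed: int,
                 n_ticks: int) -> list[dict]:
    """Run a toy scenario and return one trace record per tick."""
    rng = random.Random(seed)  # private RNG: the seed fully determines shocks
    state: State = {"food": 100.0}
    trace = []
    for tick in range(n_ticks):
        shock = rng.uniform(0.0, 10.0)   # hypothetical stress event
        action = adapter(dict(state))    # pass a copy so the adapter cannot mutate state
        consumption = 5.0 if action == "ration" else 10.0
        state["food"] = max(0.0, state["food"] - shock - consumption)
        trace.append({"tick": tick, "action": action,
                      "shock": shock, "food": state["food"]})
    return trace


def cautious_adapter(state: State) -> str:
    """Hypothetical policy: ration once the food stock runs low."""
    return "ration" if state["food"] < 60.0 else "consume"
```

Calling `run_scenario(cautious_adapter, seed=7, n_ticks=20)` twice returns identical traces, which is the property a replay check relies on.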

Step 02

Counterfactual Checks

Compare chosen actions against sampled alternatives to measure decision quality deltas.
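A decision quality delta can be sketched as the value of the chosen action minus the mean value of a sample of alternatives. The value function, the uniform sampling without replacement, and all names below are assumptions; the benchmark's actual sampling and valuation scheme is not specified in this section.

```python
import random
from statistics import mean
from typing import Callable, Sequence


def counterfactual_delta(chosen: str,
                         alternatives: Sequence[str],
                         value_fn: Callable[[str], float],
                         n_samples: int,
                         rng: random.Random) -> float:
    """Value of the chosen action minus the mean value of sampled alternatives."""
    if not alternatives:
        raise ValueError("at least one alternative action is required")
    k = min(n_samples, len(alternatives))
    sampled = rng.sample(list(alternatives), k=k)
    return value_fn(chosen) - mean(value_fn(a) for a in sampled)


# Assumed toy action values for illustration.
toy_values = {"ration": 8.0, "hoard": 2.0, "export": 4.0}
example_delta = counterfactual_delta(
    "ration", ["hoard", "export"], toy_values.__getitem__,
    n_samples=2, rng=random.Random(0),
)
```

A positive delta means the chosen action outperformed the sampled alternatives on average; a negative delta means the model passed over better options.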

Step 03

Judge Reliability

Apply pairwise swap consistency and disagreement instability diagnostics.
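Pairwise swap consistency can be sketched as follows: each pair of responses is judged twice, once in each order, and the pair counts as consistent only if the same response wins both times. The judge interface (returning `0` if the first response wins and `1` if the second wins) is an assumption, and the disagreement instability diagnostic is not covered by this sketch.

```python
from typing import Callable, Sequence

# Hypothetical judge: returns 0 if the first response wins, 1 if the second wins.
Judge = Callable[[str, str], int]


def swap_consistency(pairs: Sequence[tuple[str, str]], judge: Judge) -> float:
    """Fraction of pairs whose winner is unchanged when the order is swapped."""
    if not pairs:
        raise ValueError("at least one pair is required")
    consistent = 0
    for a, b in pairs:
        winner_forward = (a, b)[judge(a, b)]
        winner_swapped = (b, a)[judge(b, a)]
        if winner_forward == winner_swapped:
            consistent += 1
    return consistent / len(pairs)


def position_biased_judge(first: str, second: str) -> int:
    """Toy judge that always prefers whatever is shown first."""
    return 0


def length_judge(first: str, second: str) -> int:
    """Toy judge that prefers the longer response regardless of position."""
    return 0 if len(first) >= len(second) else 1
```

On pairs of distinct responses, the position-biased judge scores 0.0 because its winner flips with the order, while the order-independent judge scores 1.0.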

Step 04

Ranking + Governance

Publish ranked outcomes with anti-gaming policy and dispute/appeal workflow.

Bias and Gaming Controls

Pairwise position swap to reduce order bias in judge outcomes.
Verbosity normalization to prevent rationale-length exploitation.
Template and low-information response detectors in decision traces.
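The last two controls above might be approximated with simple text heuristics. In this sketch, verbosity is normalized by truncating rationales to a fixed token budget before judging, low-information responses are flagged by a low unique-token ratio, and templating is measured as the share of rationales that repeat after whitespace and case normalization. The thresholds, function names, and heuristics are all assumptions, not the benchmark's actual detectors.

```python
from collections import Counter
from typing import Sequence


def normalize_verbosity(rationale: str, max_tokens: int = 50) -> str:
    """Truncate to a token budget so longer rationales gain no extra weight."""
    return " ".join(rationale.split()[:max_tokens])


def is_low_information(rationale: str, min_unique_ratio: float = 0.5) -> bool:
    """Flag empty rationales or ones dominated by repeated tokens."""
    tokens = rationale.lower().split()
    if not tokens:
        return True
    return len(set(tokens)) / len(tokens) < min_unique_ratio


def template_rate(rationales: Sequence[str]) -> float:
    """Share of rationales whose normalized text occurs more than once."""
    if not rationales:
        return 0.0
    counts = Counter(" ".join(r.lower().split()) for r in rationales)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(rationales)
```

A high `template_rate` across the ticks of a decision trace suggests the model is reusing boilerplate rather than responding to the current state.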

Not a full AGI safety certification
Scenario-limited by published benchmark packs