
Authority Doesn't Verify Reality

Pre-execution authority over autonomous AI systems has a particular shape worth arguing for: a layer that evaluates a canonical representation of the action against an explicit policy before the action runs, and produces a tamper-evident record of the decision that another party can verify without trusting the system that produced it. Determinism, fail-closed behaviour and vendor substitutability are what separate governance from confident logging. The argument has not been hard to make. The properties matter, many people in the room agree they matter, and once anyone has spent enough time looking at incident reports it becomes difficult to believe that any of them are optional.

What has changed in the past few months is the arrival of a series of architectural proposals that converge on substantially the same cluster of properties from different starting points. The vocabulary varies: execution governance in one paper, runtime attestation in another, policy-bound action gating in a third. The cluster is the same, and much of what I would argue for already sits inside it.

There is a class of failure that none of these proposals addresses.

The Decision Is Right. The World Has Moved.

A system can remain deterministic, policy-compliant and internally coherent while the operational reality the action was built from has already expired. The decision is structurally valid; the receipt verifies; the chain holds. And the action is still wrong, because the world the system was reasoning about no longer exists.

The properties many safe-execution stacks promise are properties of the decision given a representation. They are silent on whether the representation still corresponds to anything outside the system at the moment the action lands.

The Shape Varies

It shows up in a few recognisable forms.

Sometimes the delegation under which the action is being taken is technically still valid, within its time bounds and in scope on paper, but the principal’s intent has shifted and no one has narrowed or revoked it yet. Sometimes the chain of state the action assumed has been severed across a process restart or an agent handoff, and the new context inherits the authority without the operational continuity. In other cases the action is being decided against the slice of the world the system can see, while the consequence-bearing facts sit in the slice it cannot. Or two internally coherent sub-models have drifted into mutual disagreement and neither side has detected it. Or an upstream schema has changed under the system; the bindings still parse, but they no longer mean what they meant. And in the case I find hardest to characterise, the world simply moved during the decision window, and the action is executed against a snapshot that no longer exists.

These look like different problems. They are the same problem expressed against different surfaces. The action is structurally valid, the evidence will hold up under audit, the decision will reproduce on replay, and the action lands in a world the system was no longer entitled to reason about.

Why Widening Authority Will Not Fix This

The tempting response is to make authority do more work, to have it look at more state and verify the world before approving.

I have proposed this to myself several times. Each time the cost has turned out to be worse than the failure it would have addressed.

Authority is narrow on purpose. Its job is to decide whether a specific action, given an explicit delegation and a versioned policy, is permitted, and to produce evidence of that decision that can be reconstructed without trusting the system that produced it. The properties that make this possible are tightly coupled: the decision must be a pure function of its inputs, the evaluation must be deterministic, and the receipt must be byte-identical when the inputs are the same. These are not stylistic preferences. They are what allows a governance layer to mean the same thing across two vendors or two infrastructures, what allows an auditor to reconstruct a decision years later, and what makes governance into something an organisation can buy as infrastructure rather than as a vendor feature.
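To make the coupling concrete, here is a minimal sketch of a decision that is a pure function of its inputs and yields a byte-identical receipt on every evaluation. Everything in it is an assumption made for illustration: the field names, the toy policy rule, and the use of sorted-key JSON as a stand-in for a real canonicalisation scheme. The receipt is also unsigned, which a real one would not be.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Serialise to a stable byte string: sorted keys, fixed separators."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def decide(action: dict, delegation: dict, policy: dict) -> dict:
    """Pure function of its inputs: no clock reads, no network, no randomness."""
    permitted = (
        action["type"] in policy["allowed_types"]
        and action["type"] in delegation["scope"]
        and action["amount"] <= policy["max_amount"]
    )
    # Bind the receipt to the exact inputs the decision was made over.
    inputs_digest = hashlib.sha256(
        canonical_bytes({"action": action, "delegation": delegation, "policy": policy})
    ).hexdigest()
    return {
        "decision": "permit" if permitted else "deny",
        "inputs_sha256": inputs_digest,
        "policy_version": policy["version"],
    }


# Hypothetical inputs, chosen only to exercise the function.
action = {"type": "refund", "amount": 40}
delegation = {"principal": "alice", "scope": ["refund"]}
policy = {"version": "2024-01", "allowed_types": ["refund"], "max_amount": 100}

receipt_a = canonical_bytes(decide(action, delegation, policy))
receipt_b = canonical_bytes(decide(action, delegation, policy))
```

Because nothing inside decide reads state from outside its arguments, receipt_a and receipt_b are the same bytes, and a second implementation following the same rules would be expected to produce them too.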

Asking authority to also verify whether the world is still the world breaks each of those properties. The decision stops being a pure function of its inputs; it now requires live observation. The evaluation stops being deterministic; it now turns on inference about freshness. And the receipt stops being reproducible at all, because two evaluations of the same canonical action will produce different decisions depending on what state happened to be visible at the moment each was made.

There is a harder problem sitting underneath that. Some of the failure modes I listed have clean answers. Others do not. A delegation that has expired has a clean answer: cryptographic freshness on the revocation record, denial when the freshness is missing or stale, the kind of mechanism that can be audited and substituted across systems. A canonical schema that has been quietly mutated has a clean answer too: versioned bindings, re-compilation of every policy that depended on the old shape, refusal of receipts written against a contract that no longer means what it once did. But the world moving during a decision window does not have a clean answer in the general case. There is no way, by inspection, to know whether the observation taken two hundred milliseconds ago is still the observation that should bind the action.
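The clean answer for delegations can be sketched as a fail-closed freshness check on the revocation record. This is an illustration under stated assumptions, not a reference design: the record fields (revoked, checked_at) are hypothetical, signature verification on the record is omitted, and the evaluation time is passed in as an explicit input so that the check stays reproducible on replay.

```python
from typing import Optional


def delegation_allowed(
    revocation_record: Optional[dict],
    evaluated_at: int,
    max_age_seconds: int,
) -> bool:
    """Fail closed: a missing, stale, malformed or revoked status is a denial."""
    if revocation_record is None:
        return False  # revocation status is unverifiable
    if revocation_record.get("revoked", True):
        return False  # revoked, or the field is absent
    checked_at = revocation_record.get("checked_at")
    if not isinstance(checked_at, int):
        return False  # no usable timestamp on the record
    age = evaluated_at - checked_at
    # A negative age means the record claims to come from the future: deny.
    return 0 <= age <= max_age_seconds
```

No comparable function can be written for the world moving during the decision window, because there is no record whose age could be measured against a bound.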

Selling a guarantee against that class of failure means selling a guarantee that cannot be formally made. Folding it into authority means contaminating authority with a property it cannot deliver, while losing the properties it can.

What Authority Already Covers

Before going further I want to mark which of those failure modes a well-designed authority layer does in fact handle. The line matters, and I do not want the argument to be read as a claim that pre-execution authority is incomplete in ways it is not.

A delegation whose revocation status is unverifiable, expired or beyond its freshness bound is exactly the case pre-execution authority is supposed to refuse, and a fail-closed implementation will refuse it. A canonical schema that has been quietly mutated should produce a different policy identity and therefore a different receipt, with a denial standing until the policy has been re-compiled against the new shape. A composition that looks coherent at each end but has lost its authority connection in the middle is the case I find most often overlooked; the answer is chained component receipts, a delegation chain that narrows monotonically as it composes, and a coordination artefact whose own schema is declared and verifiable, so the seam itself can be audited.
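The composition case is the easiest of the three to get subtly wrong, so a small sketch may help. It assumes scopes can be modelled as sets of permission strings and that component receipts are linked by embedding the hash of the previous receipt; both are simplifications I am introducing here, and the function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first receipt in a chain


def chain_narrows(scopes: list) -> bool:
    """True if each delegation's scope is a subset of the one before it."""
    return all(set(child) <= set(parent) for parent, child in zip(scopes, scopes[1:]))


def _digest(decision, prev_hash: str) -> str:
    body = {"decision": decision, "prev": prev_hash}
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def link_receipts(component_decisions: list) -> list:
    """Chain receipts so that each one commits to its predecessor."""
    receipts, prev_hash = [], GENESIS
    for decision in component_decisions:
        current = _digest(decision, prev_hash)
        receipts.append({"decision": decision, "prev": prev_hash, "hash": current})
        prev_hash = current
    return receipts


def chain_intact(receipts: list) -> bool:
    """Recompute every link; any edited or missing receipt breaks the chain."""
    prev_hash = GENESIS
    for receipt in receipts:
        if receipt["prev"] != prev_hash:
            return False
        if _digest(receipt["decision"], receipt["prev"]) != receipt["hash"]:
            return False
        prev_hash = receipt["hash"]
    return True
```

A handoff that widens scope fails chain_narrows, and a seam where a receipt was dropped or altered fails chain_intact, which is the sense in which the seam itself becomes auditable.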

What is not in scope, and what I deliberately do not claim authority should cover, is the freshness of the agent’s world model, the truth of the observations the action was built from, and the question of whether the operational substrate the system is reasoning about has asynchronously moved on while the system reasoned. Those are different problems with different evidence requirements. Treating them as if they were the same problem is how architectures end up promising more than they can deliver.

A Plane That Sits Beside Authority

The cleaner architecture, and the one I keep returning to, is to give operational-state freshness its own plane.

The new plane sits alongside authority, not inside it. It produces its own evidence, makes its own falsifiable claims about freshness, scope and bounds, and exposes those claims as inputs that the authority decision is allowed to require. In practice that means the canonical representation of an action carries, when the policy demands it, a signed assertion of operational-state freshness, issued by a system whose job is operational-substrate trust, scoped to a window the policy bounds. The authority decision is unchanged in shape. It checks that the freshness assertion is present, recent, in scope and verifiable. If any of those fails, the decision denies.
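A rough sketch of that boundary follows, with several assumptions worth flagging: the claim fields (resource, observed_at) are hypothetical, an HMAC over a shared key stands in for whatever signature scheme a real freshness plane would use, and the evaluation time is an explicit input, not a clock read, so that the authority decision remains a pure function.

```python
import hashlib
import hmac
import json


def _encode(claims: dict) -> bytes:
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def issue_assertion(claims: dict, key: bytes) -> dict:
    """Freshness plane: sign a claim about what was observed, and when."""
    tag = hmac.new(key, _encode(claims), hashlib.sha256).hexdigest()
    return {"claims": claims, "tag": tag}


def authority_accepts(assertion, key: bytes, resource: str,
                      evaluated_at: int, max_age_seconds: int) -> bool:
    """Authority plane: present, verifiable, in scope and recent, or deny."""
    if not assertion:
        return False  # assertion absent
    claims, tag = assertion.get("claims"), assertion.get("tag")
    if not isinstance(claims, dict) or not isinstance(tag, str):
        return False  # malformed
    expected = hmac.new(key, _encode(claims), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8")):
        return False  # not verifiable
    if claims.get("resource") != resource:
        return False  # out of scope for this action
    observed_at = claims.get("observed_at")
    if not isinstance(observed_at, int):
        return False
    # Recent enough, within the window the policy bounds.
    return 0 <= evaluated_at - observed_at <= max_age_seconds
```

Note what authority_accepts does not do: it never looks at the world. It checks a signed claim about the world, and whether that claim was true when it was issued remains the freshness plane's proof obligation.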

Authority stays narrow, and its determinism, reproducibility and substitutability survive intact. The freshness layer, by contrast, has its own well-defined surface: its own version of fail-closed, its own proof obligations, and, importantly, its own buyer inside the organisation, because operational-substrate trust is not a problem the team responsible for governance will solve on its own. The two compose at the action boundary, and their receipts compose into an end-to-end record that an external party can examine.

This is the same separation-of-concerns move that produced the distinction between governance and identity, between governance and logging, and between governance and content guardrails. Each of those distinctions exists because the moment one plane starts to absorb the work of another, the structure that makes the whole stack legible collapses. Operational state belongs in the same conversation.

Where The Line Sits

The properties many safe-execution architectures advertise (determinism, fail-closed enforcement, evidence-producing decisions) are properties of the decision given a representation. They cover one important class of question, and the field is better off for having converged on them. They do not, however, cover the question of whether the world the action was built from is still real, and anyone selling them as if they did is selling a guarantee that cannot be formally made.

The wrong response to that gap is to keep widening the authority layer until it tries to absorb the problem. The right response, as far as I can see, is to name the layer that sits beside it, give it its own proof obligations, and require its evidence as an input the authority decision is allowed to consult. Authority then decides what it has always decided: whether the action is permitted under the delegation it was given. The question of whether the world is still the world is decided somewhere else, by a system whose proof obligations are honest about what they are.

Those are different layers. The boundary between them holds.