Why Runtime Governance Isn't Enough

The gap between observing and controlling

Every month a new agent governance project arrives with the same pitch: intercept tool calls, evaluate policy, log the outcome. The pattern descends from service-mesh sidecars and API gateways — battle-tested infrastructure that works brilliantly for HTTP traffic between microservices.

But AI agents are not microservices. A microservice calls a known endpoint with a deterministic payload. An agent decides at runtime which endpoint to call, with which payload, potentially rewriting both based on the output of a language model. The governance surface is not the network hop — it is the mutation itself.

When the action is a database read or a search query, runtime interception is sufficient. When the action is a $50,000 wire transfer, a prescription renewal, or a production deployment, the question shifts from "was policy checked?" to "was the exact approved mutation the one that executed?"

The core distinction

Runtime governance answers: "should this action be allowed?" Execution control answers: "is the action that executed exactly the one that was approved, and only once?"

This is not a criticism of runtime governance tooling; those projects solve real problems. It is an observation that for high-consequence actions, the governance layer and the execution layer must be the same system. Separating them creates a gap that no amount of logging can close after the wire has moved.

Anatomy of an uncontrolled agent action

Consider a support agent that issues refunds. The happy path looks fine: customer requests a refund, agent checks eligibility, agent calls the payment processor. The architecture review passes. The demo works. Then production happens.

Five things that go wrong

The agent calls the payment API directly. If the agent runtime holds the payment processor's API key, any code path that constructs an HTTP request can execute a refund. Policy middleware only catches calls that route through the instrumented tool. A direct httpx.post() bypasses it entirely.
The approved request gets changed. Policy evaluates at decision time. Between the policy check and the actual HTTP call, the payload can be modified — by a downstream function, a retry handler, or a prompt injection that alters the tool-call arguments. The mutation that executes is not the one policy approved.
The same refund fires ten times. Network retries, agent loops, and framework-level retry decorators can replay the same action. Each replay is a real financial transaction. Runtime governance that checks "is this allowed?" returns yes every time because the request is valid — it just should not execute again.
Credentials leak from the runtime. If the agent process holds downstream API keys in memory or environment variables, a container escape, memory dump, or log exfiltration exposes permanent access to the payment processor. The blast radius is not one bad action — it is every action the key permits.
An unsupported endpoint gets called. The agent discovers a new API path via documentation retrieval or prompt chaining. If there is no allow-list at the execution boundary, the call goes through. The schema was never validated. The policy never evaluated it. But the money moved.

These are not exotic attacks. They are Tuesday in a production agent system. Each one exploits the same structural gap: the governance layer sits beside the execution path, not on it.
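
The second failure mode, the approved request changing before it executes, can be reproduced in a few lines. This is a minimal sketch, not code from any real framework: `policy_allows`, `retry_handler`, `execute`, and the refund limit are all hypothetical names and values chosen for illustration.

```python
import copy

REFUND_LIMIT_CENTS = 10_000  # hypothetical policy threshold


def policy_allows(request: dict) -> bool:
    """Side-by-side policy check: sees the request at decision time only."""
    return request["action"] == "refund" and request["amount_cents"] <= REFUND_LIMIT_CENTS


def retry_handler(request: dict) -> dict:
    """Stand-in for any code that rewrites the request after approval."""
    mutated = copy.deepcopy(request)
    mutated["amount_cents"] = 5_000_000  # altered after the policy check
    return mutated


sent_to_processor = []  # records what would actually reach the payment API


def execute(request: dict) -> None:
    """Stand-in for the real HTTP call; nothing here re-checks policy."""
    sent_to_processor.append(request)


approved = {"action": "refund", "customer": "cus_123", "amount_cents": 2_500}
decision = policy_allows(approved)      # the approved shape passes policy
if decision:
    execute(retry_handler(approved))    # but a different shape executes

print(decision, approved["amount_cents"], sent_to_processor[0]["amount_cents"])
```

The policy log would show a valid approval for 2,500 cents, while the request that reached the processor would have failed the same check.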

What an execution rail actually is

An execution rail is a constrained, mediated path between the agent's intent and the downstream system's mutation endpoint. The key properties:

Exclusive path. The agent cannot reach the downstream mutation endpoint except through the rail. The agent holds no downstream credentials.
Binding. The approved request shape is cryptographically locked before execution. Any modification — even one byte — invalidates the approval.
Single-use. Each approval token is consumed on execution. Replay is rejected by the rail when it verifies the permit, not left to the application layer.
Forward-time credential retrieval. Downstream API keys are fetched at the moment of forwarding and never returned to the caller. The agent runtime never sees them.

Agent. Holds only the IntentChain (IC) key.
Authenticate. API key + agent ID.
Validate. Schema + action match.
Policy. Deterministic evaluation.
Permit. Ed25519-signed JWT.
Credentials. Retrieved at forward time.
Forward. Exact approved payload.
Receipt. Full audit trail.

The IntentChain execution pipeline. Every step happens on the mutation path itself — there is no side-channel.

This is what makes an execution rail different from a proxy or a middleware hook. The rail does not observe the action and decide whether to allow it. The rail is the action. If the rail denies the request, the action does not happen — not because a log entry says it should not have, but because the only path to the downstream system runs through the rail.
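
To make "the rail is the action" concrete, here is a rough sketch of a rail in which the forward step is only reachable after every earlier stage has passed. The `Rail` class, its fields, and the policy limit are assumptions made for illustration, not IntentChain's implementation. Permit issuance and credential retrieval are left out to keep it short.

```python
import hmac
from dataclasses import dataclass, field


class Denied(Exception):
    """Raised by any stage; the forward step is then never reached."""


@dataclass
class Rail:
    api_keys: dict          # agent_id -> rail API key (hypothetical store)
    allowed_actions: set    # canonical actions the rail will forward
    max_amount_cents: int   # hypothetical deterministic policy limit
    forwarded: list = field(default_factory=list)  # stands in for the downstream call
    receipts: list = field(default_factory=list)   # audit trail, one entry per request

    def handle(self, agent_id: str, api_key: str, action: str, payload: dict) -> dict:
        try:
            self._authenticate(agent_id, api_key)
            self._validate(action, payload)
            self._policy(payload)
            # Permit issuance and credential retrieval would sit here;
            # they are omitted to keep the sketch short.
            self.forwarded.append((action, dict(payload)))
            outcome = "forwarded"
        except Denied as reason:
            outcome = f"denied: {reason}"
        receipt = {"agent": agent_id, "action": action, "outcome": outcome}
        self.receipts.append(receipt)
        return receipt

    def _authenticate(self, agent_id: str, api_key: str) -> None:
        expected = self.api_keys.get(agent_id, "")
        if not expected or not hmac.compare_digest(expected, api_key):
            raise Denied("unknown agent or bad key")

    def _validate(self, action: str, payload: dict) -> None:
        if action not in self.allowed_actions:
            raise Denied("unsupported action")
        if not isinstance(payload.get("amount_cents"), int):
            raise Denied("schema mismatch")

    def _policy(self, payload: dict) -> None:
        if payload["amount_cents"] > self.max_amount_cents:
            raise Denied("amount over limit")


rail = Rail({"support-agent": "ic_demo_key"}, {"money.payment"}, 10_000)
ok = rail.handle("support-agent", "ic_demo_key", "money.payment", {"amount_cents": 2_500})
blocked = rail.handle("support-agent", "ic_demo_key", "money.payment", {"amount_cents": 5_000_000})
```

Both requests produce a receipt, but only the first one appears in `rail.forwarded`. The denied request never reaches the forward step.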

The permit model: cryptographic proof of approval

Most governance systems produce a decision: allow or deny. IntentChain produces a permit — an Ed25519-signed JWT that binds the exact request parameters, the agent identity, the canonical action, and a hash of the payload.

What the permit binds

Agent identity — which agent requested the action
Canonical action — the typed operation (e.g., money.payment)
Request hash — SHA-256 of the exact payload bytes
Policy version — which policy ruleset was evaluated
Expiry — 60-second TTL by default
Unique ID (jti) — consumed on use, preventing replay

The gateway verifies the permit signature before touching any credential or opening any connection to the downstream system. If the payload hash does not match — because the agent, a retry handler, or any intermediate code modified the request — the gateway rejects it. There is no "almost valid" permit.
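
The binding can be sketched with the standard library alone. One substitution to note: this example signs with HMAC-SHA256 as a stand-in for Ed25519 so that it runs without third-party packages, and it uses a simplified two-part token, not a real JWT. The claim names `sub`, `act`, `req_sha256`, and `pol` are hypothetical. Only the bound fields themselves come from the list above.

```python
import base64
import hashlib
import hmac
import json
import time
import uuid

SIGNING_KEY = b"demo-only-key"  # stand-in; a real rail would sign with an Ed25519 private key


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _sign(body: str) -> str:
    return _b64(hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).digest())


def issue_permit(agent_id: str, action: str, payload: bytes,
                 policy_version: str, ttl: int = 60) -> str:
    claims = {
        "sub": agent_id,                                   # agent identity
        "act": action,                                     # canonical action
        "req_sha256": hashlib.sha256(payload).hexdigest(), # hash of exact payload bytes
        "pol": policy_version,                             # policy ruleset evaluated
        "exp": int(time.time()) + ttl,                     # expiry
        "jti": uuid.uuid4().hex,                           # unique permit ID
    }
    body = _b64(json.dumps(claims, sort_keys=True).encode())
    return f"{body}.{_sign(body)}"


def verify_permit(permit: str, payload: bytes) -> dict:
    """Check the signature first, then the payload hash; raise ValueError on any mismatch."""
    body, sig = permit.split(".")
    if not hmac.compare_digest(sig, _sign(body)):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    actual = hashlib.sha256(payload).hexdigest()
    if not hmac.compare_digest(claims["req_sha256"], actual):
        raise ValueError("payload does not match approved hash")
    return claims


payload = b'{"amount_cents":2500,"customer":"cus_123"}'
permit = issue_permit("support-agent", "money.payment", payload, "v1")
claims = verify_permit(permit, payload)  # passes: same bytes that were approved
```

Calling `verify_permit(permit, payload.replace(b"2500", b"2501"))` raises `ValueError`, because a one-character change produces a different SHA-256 digest. Expiry and single use are separate checks and are not enforced here.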

Why Ed25519?

Ed25519 offers fast signing and verification (sub-millisecond on commodity hardware) and compact 64-byte signatures, and it is designed to resist timing side channels. Per-tenant keypairs mean a compromised key affects one tenant, not the platform.

The design choice here is deliberate: the permit is not a bearer token that grants broad access. It is a time-bound, single-use authorization for one specific mutation with one specific shape. After the 60-second window or first use, it is dead.
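
The "first use or 60 seconds, whichever comes first" rule needs a record of which permit IDs have been consumed. Here is a minimal in-memory sketch. The `PermitLedger` name is hypothetical, and a real deployment would presumably need a shared store with an atomic check-and-set, not a per-process set.

```python
import threading
import time
from typing import Optional


class PermitLedger:
    """Tracks consumed permit IDs (jti). In-memory only, so a sketch."""

    def __init__(self) -> None:
        self._consumed = set()
        self._lock = threading.Lock()

    def consume(self, jti: str, exp: float, now: Optional[float] = None) -> None:
        """Mark a permit as used, or raise PermissionError if it is expired or replayed."""
        now = time.time() if now is None else now
        if now >= exp:
            raise PermissionError("permit expired")
        with self._lock:  # the check and the mark must happen as one step
            if jti in self._consumed:
                raise PermissionError("permit already used")
            self._consumed.add(jti)


ledger = PermitLedger()
ledger.consume("permit-1", exp=time.time() + 60)  # first use succeeds
```

A second `ledger.consume("permit-1", ...)` raises `PermissionError`, even though the permit's signature and payload hash would still verify.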

Credential separation as architecture

Most agent security discussions focus on what the agent is allowed to do. Fewer focus on what the agent is allowed to hold. This matters because the blast radius of a compromised agent depends entirely on the credentials in its runtime.

An agent with a Stripe API key in its environment can issue unlimited refunds to any customer. An agent with only an IntentChain API key can request refunds — but each request is schema-validated, policy-evaluated, and bound to a single-use permit before the gateway retrieves the Stripe key, makes the call, and discards the credential from memory.

The credential never reaches the agent

IntentChain's gateway retrieves downstream credentials from the configured credential store (Azure Key Vault, AWS Secrets Manager, or any provider behind the credential interface) at the moment of forwarding. The credential exists in gateway memory for the duration of one HTTP call. It is never returned in the response to the agent. It is never logged.

This is not just a security best practice — it is a structural guarantee. The agent cannot leak what it does not have. A container escape from the agent runtime yields an IntentChain API key, which can only request governed actions through the rail. The blast radius collapses from "every action the downstream key permits" to "actions that pass validation, policy, and permit issuance."
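
Forward-time retrieval can be sketched as a function that fetches the secret, makes one call, and returns only selected fields of the result. Everything named here is an assumption for illustration: `CredentialStore`, `InMemoryStore`, `forward`, and `fake_send`. `InMemoryStore` stands in for a real secret manager client.

```python
from typing import Callable, Protocol


class CredentialStore(Protocol):
    def get(self, name: str) -> str: ...


class InMemoryStore:
    """Stand-in for a real secret manager client."""

    def __init__(self, secrets: dict) -> None:
        self._secrets = dict(secrets)

    def get(self, name: str) -> str:
        return self._secrets[name]


def forward(store: CredentialStore, secret_name: str, payload: bytes,
            send: Callable[[bytes, dict], dict]) -> dict:
    """Fetch the credential, make one downstream call, return only the result."""
    credential = store.get(secret_name)  # fetched at forward time, not before
    try:
        result = send(payload, {"Authorization": f"Bearer {credential}"})
    finally:
        # Drops the local reference only; Python does not zero the memory.
        del credential
    # Build a fresh dict from selected fields so the credential cannot ride along.
    return {"status": result["status"], "downstream_id": result.get("id")}


seen_headers = {}


def fake_send(payload: bytes, headers: dict) -> dict:
    """Stand-in for the downstream HTTP call; echoes headers to simulate a leaky API."""
    seen_headers.update(headers)
    return {"status": "succeeded", "id": "re_1", "echo": headers}


store = InMemoryStore({"payments/api-key": "sk_demo_secret"})
response = forward(store, "payments/api-key", b"{}", fake_send)
```

The downstream call sees the credential, but `response` contains only `status` and `downstream_id`, even though the simulated API echoed the headers back.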

Credential-in-runtime is the default today

Most agent frameworks expect you to inject secrets as environment variables or pass them via tool configuration. This works for demos. In production, it means every agent process is a credential store with the attack surface of whatever LLM is running inside it.

Where each approach holds and breaks

No single approach covers everything. The question is which properties matter for your specific action class. Here is where each model holds and where it does not:

Property	Framework middleware	Generic API proxy	Execution rail
Catches tool calls inside the framework	✓ Yes	✕ Not instrumented	✓ Yes (exclusive path)
Catches direct HTTP calls outside framework	✕ Bypassed	✓ If routed through proxy	✓ Agent has no other path
Proves executed payload matches approved payload	✕ Decision and execution are separate	✕ No binding mechanism	✓ Hash in signed permit
Prevents replay of identical valid requests	✕ Each evaluation is independent	✕ Application-layer concern	✓ Single-use permit ID
Agent runtime holds no downstream credentials	✕ Credentials in runtime	Depends on config	✓ Forward-time retrieval
Works across agent frameworks	Per-framework adapter needed	✓ Framework-agnostic	✓ HTTP-level integration
Sub-millisecond decision latency	✓ In-process	✓ Pass-through	Depends on policy complexity

The reality is that most production systems will use multiple layers. Framework middleware is excellent for low-risk tool calls, prompt filtering, and observability. API proxies centralize traffic management. The execution rail is specifically for actions where you need the guarantee: this exact mutation, once, with these exact parameters, and nothing else.

Building for the mutation path

The agentic AI ecosystem is maturing fast. Frameworks are stabilizing. Runtime governance toolkits are shipping. Regulatory requirements are crystallizing. This is positive progress — the more layers of defense available, the safer agent systems become.

But there is a category of action where defense-in-depth is not enough. Where the action is financially irreversible, compliance-critical, or safety-sensitive, you need the governance and the execution to be the same system. Not a sidecar that logs what happened. Not a middleware that approved the intent. A controlled path that ensures the approved action is the one that executes, exactly once, with no credential leakage.

That is what we built IntentChain to be: the execution rail for high-stakes agent actions. Not a replacement for runtime governance — a complement that closes the gap between decision and outcome.

Where to start

If you are building agent systems that touch money, patient records, infrastructure state, or any mutation where "oops, the agent did it twice" has real consequences:

Audit your credential topology. Which agents hold which downstream API keys? What is the blast radius of each?
Identify your irreversible actions. Payments, transfers, refunds, account state changes, production deployments. These are your execution rail candidates.
Separate the approval from the execution. If the system that checks policy is not the same system that performs the action, map the gap. What happens in between?
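
The first step can start as a simple inventory. The sketch below assumes you can list which keys each agent holds and which actions each key permits. The agent names, key names, and action names are invented for illustration.

```python
def blast_radius(agent_keys: dict, key_scopes: dict) -> dict:
    """Map each agent to every downstream action its held keys permit."""
    return {
        agent: sorted({action for key in keys for action in key_scopes.get(key, ())})
        for agent, keys in agent_keys.items()
    }


# Hypothetical inventory: which agent holds which key, and what each key can do.
agent_keys = {"support-agent": ["payments_live_key"], "search-agent": []}
key_scopes = {"payments_live_key": ["refund.create", "payout.create", "customer.read"]}

report = blast_radius(agent_keys, key_scopes)
for agent, actions in report.items():
    print(agent, actions)
```

An agent that only needs to issue refunds but holds a key that also permits payouts is a candidate for moving behind a rail.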

IntentChain supports both native actions (canonical endpoints like money.payment) and mirrored profiles (tenant-uploaded OpenAPI specs that the gateway validates and forwards). Start with one governed action and expand.
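
For a mirrored profile, an allow-list at the execution boundary can be derived from the uploaded spec. This is a rough sketch that assumes the spec is already parsed into a dict with the standard OpenAPI `paths` layout. It matches paths exactly and does not expand templates such as `/refunds/{id}`. The function names and sample spec are hypothetical and do not describe how IntentChain's gateway does it.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def allowed_operations(spec: dict) -> set:
    """Collect (METHOD, path) pairs declared under the spec's `paths` object."""
    ops = set()
    for path, item in spec.get("paths", {}).items():
        for key in item:
            if key.lower() in HTTP_METHODS:  # skip non-method keys such as "parameters"
                ops.add((key.upper(), path))
    return ops


def is_allowed(ops: set, method: str, path: str) -> bool:
    """Exact-match check; anything not declared in the spec is rejected."""
    return (method.upper(), path) in ops


# Hypothetical tenant-uploaded spec, already parsed from JSON or YAML.
spec = {
    "paths": {
        "/v1/refunds": {"post": {}, "parameters": []},
        "/v1/customers": {"get": {}},
    }
}
ops = allowed_operations(spec)
print(is_allowed(ops, "post", "/v1/refunds"), is_allowed(ops, "post", "/v1/payouts"))
```

A path the agent discovered through documentation retrieval, such as `/v1/payouts` here, is rejected because it was never declared.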

The question in 2026 is not whether agents need governance — that debate is settled. The question is whether the governance can prove the outcome, not just the decision. For the actions that matter most, the answer requires an execution rail.
