Sovereign infrastructure for AI agents handling capital: a practitioner's reference
What production-grade infrastructure for autonomous agents that move money actually requires: human-oversight controls, attestation, deterministic safety nets, and the Article 14 deadline.
Here is what this piece gives you: a reference for the infrastructure an AI agent needs before it is allowed near real capital or a regulated domain. The list is specific. The failure modes are real. The August 2, 2026 deadline for EU AI Act Article 14 compliance is fixed.
If you are building AI agents that execute trades, approve transactions, manage customer funds, or take any action with financial or regulatory consequence, this is the infrastructure checklist you need to read before your next deploy.
The problem: non-deterministic agents in deterministic domains
The thing most teams building agentic systems miss is the mismatch at the boundary. The agent is probabilistic: its outputs vary based on context, on the LLM’s sampled token sequence, on prompt ordering. The domain the agent operates in is not. Capital flows have ledger entries. Regulatory filings have timestamps. Custody operations have audit trails. Contracts have settlement finality.
When a non-deterministic system takes actions in a deterministic domain, the burden on the infrastructure around the agent is higher than most teams have designed for. The agent is going to do something unexpected. The question is whether your infrastructure noticed, logged it in a tamper-evident way, and stopped it before it became a loss event.
Most teams have built on duct tape. The kill switch is a comment in a config file. The audit log is a database table the same process writes to. The policy engine is a set of if-statements in the agent’s prompt. The attestation story is “we trust our EC2 instance.”
This works fine until it doesn’t. The failure event is usually sudden, usually at a time when no human is monitoring, and usually involves a combination of factors nobody anticipated in isolation.
The Article 14 deadline is not a soft target
Article 14 of the EU AI Act mandates human oversight for high-risk AI systems. The obligations for the high-risk systems listed in Annex III apply from August 2, 2026. For AI systems in scope (which includes systems that make financial decisions affecting individuals, such as credit decisions), the human oversight controls are not optional after that date.
The specific engineering requirements Article 14 imposes, paraphrased: the ability for a natural person to understand what the system is doing, to intervene and override it, and to stop it. These are not product features. They are compliance requirements with legal consequence.
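Annex III of the Act lists the high-risk categories. The financial entries are narrower than “financial services”: systems that evaluate the creditworthiness of natural persons or establish their credit score (fraud detection is excluded), and systems used for risk assessment and pricing in life and health insurance. Trading in financial instruments is not named in Annex III, so a trading agent is not automatically high-risk; scope depends on whether its decisions feed one of the listed uses. If your AI agent participates in decisions about who gets credit, or you cannot rule out that it does, treat the system as in scope until a legal assessment says otherwise.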
The most common mistake teams make: assuming “we’ll be compliant by August” without mapping the Act’s engineering requirements to specific infrastructure decisions. Most teams that have done the mapping find that one or more of the three core requirements (understand, intervene, stop) is either unimplemented or implemented in a form that would not survive a regulator’s review. The gap analysis is not optional. You need to know where you stand before the deadline, not after.
What follows describes what those requirements mean in engineering terms, and what the rest of the sovereign infrastructure stack looks like when you build it correctly.
Section 1: Human-oversight controls
Article 14’s three requirements (understand, intervene, stop) map to specific engineering primitives. The order matters.
Understand: explainability at the action boundary
Understanding what the agent is doing requires more than an audit log. It requires that every consequential action the agent takes be logged in a structured, human-readable form at the point of decision, before execution. This means:
- The agent’s stated reasoning (or a structured summary of inputs, tool calls, and outputs) is captured per action.
- Actions are classified by type and severity at log time.
- The log is written before the action executes, not after. A log entry that says “the agent sent $50,000 to address X” written after the transfer does not help you understand it in time to intervene.
This is harder than it sounds. Most LLM agents emit reasoning in natural language that is not structured for audit. Building an action-logging layer that captures the right fields (timestamp, agent ID, action type, parameters, reasoning summary, policy evaluation result) requires deliberate design.
A useful test: pick a consequential action your agent took last week. Can you reconstruct, from your logs alone, exactly what inputs the agent saw, what it decided, what the policy layer said, and what it did? If the answer is “mostly, if I also look at the database” or “I’d need to correlate three separate systems,” your explainability layer is not audit-grade. Audit-grade means one log query returns the full decision chain for any past action.
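A minimal sketch of a log-before-execute wrapper, assuming a hypothetical `write_log` callable that raises if the write fails. The field names follow the list above; `ActionRecord` and `log_then_execute` are illustrative names, not an existing library.

```python
import json
import time
import uuid
from dataclasses import dataclass, asdict, field
from typing import Any, Callable


@dataclass(frozen=True)
class ActionRecord:
    """One decision, captured before anything executes."""
    agent_id: str
    action_type: str
    severity: str                  # classified at log time, e.g. "low" or "high"
    parameters: dict[str, Any]
    reasoning_summary: str
    policy_result: str             # "approved" or "rejected" in this sketch
    action_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: float = field(default_factory=time.time)


def log_then_execute(record: ActionRecord,
                     write_log: Callable[[str], None],
                     execute: Callable[[dict[str, Any]], Any]) -> Any:
    """Write the intent first; execute only if that write succeeded
    and the policy layer approved the action."""
    # If write_log raises, the exception propagates and nothing executes.
    write_log(json.dumps(asdict(record), sort_keys=True))
    if record.policy_result != "approved":
        return None
    result = execute(record.parameters)
    # The outcome is a second entry, joined to the intent by action_id.
    write_log(json.dumps({"action_id": record.action_id, "outcome": repr(result)}))
    return result
```

Because intent and outcome share an `action_id`, a single query on that key returns the decision chain for the action, which is the property the test above asks for.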
Intervene: override mechanisms that work under load
An override mechanism that works in a dev environment during business hours is not the same as an override mechanism that works at 3am when the agent is mid-sequence on a high-volume trading day.
For an override to be real, three conditions must hold: it must be reachable (the operator can trigger it from wherever they are), it must be acknowledged by the system within a defined time bound (not “will take effect on the next cycle”), and it must be persistent (a restart doesn’t undo it).
The common failure mode: the override sets a flag in a Redis cache. The agent reads the cache at startup. The cache expires. The agent resumes. Nobody noticed.
Correct design: override state is persisted in durable storage, checked at every action decision point (not just at startup), and the agent cannot clear or expire the override without explicit operator action.
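One way to sketch that design uses SQLite as a stand-in for durable storage. `OverrideStore` and `guarded_action` are hypothetical names. In a real deployment the agent's credentials would have read-only access to this table; here that separation exists only by convention.

```python
import sqlite3
import time


class OverrideStore:
    """Durable operator override. Only the operator_* methods change it."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS override ("
            " id INTEGER PRIMARY KEY CHECK (id = 1),"
            " active INTEGER NOT NULL, operator TEXT, reason TEXT, set_at REAL)"
        )
        # Single-row table; the row is created once and never expires.
        self._db.execute("INSERT OR IGNORE INTO override (id, active) VALUES (1, 0)")
        self._db.commit()

    def operator_set(self, operator: str, reason: str) -> None:
        self._db.execute(
            "UPDATE override SET active = 1, operator = ?, reason = ?, set_at = ?"
            " WHERE id = 1", (operator, reason, time.time()))
        self._db.commit()

    def operator_clear(self, operator: str) -> None:
        self._db.execute(
            "UPDATE override SET active = 0, operator = ?, set_at = ? WHERE id = 1",
            (operator, time.time()))
        self._db.commit()

    def is_active(self) -> bool:
        row = self._db.execute("SELECT active FROM override WHERE id = 1").fetchone()
        return bool(row[0])


class OverrideActive(Exception):
    pass


def guarded_action(store: OverrideStore, action):
    """Check the override at the decision point, every time, then act."""
    if store.is_active():
        raise OverrideActive("operator override in force; action refused")
    return action()
```

Reading the flag on every call costs one local query per action. That is the price of closing the gap where a flag is read once at startup and then forgotten.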
Stop: the difference between a kill switch that works and one that’s theater
A kill switch that works has two properties that most implementations lack: it is deterministic (triggering it will stop the agent, regardless of what the agent is doing at the time), and it is complete (it stops all downstream effects, not just the LLM inference loop).
The incomplete kill switch is the common case. You stop the agent process. The agent had already submitted 40 orders to the exchange before it stopped. The exchange is executing those orders right now. The kill switch did not stop the orders; it stopped the agent. That’s not a kill switch. That’s a process manager.
A real kill switch architecture for an agent handling capital requires:
- A halt state that prevents new actions from initiating (the agent cannot start a new action if halted).
- A drain procedure that handles in-flight actions (orders already submitted go through a defined sequence before they are abandoned or cancelled).
- A propagation path that reaches every component that can initiate an external effect: the LLM inference layer, the tool execution layer, the order submission layer, the notification layer.
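The three requirements above can be sketched in one small class. This is an in-memory illustration with hypothetical names (`KillSwitch`, `begin_action`, `register_component`). It shows the halt gate, the drain of in-flight actions, and the propagation hooks, and it deliberately leaves out durable persistence of the halt state.

```python
import threading
from typing import Callable


class KillSwitch:
    """Halt gate plus drain and propagation hooks (in-memory sketch)."""

    def __init__(self) -> None:
        self._halted = threading.Event()
        self._lock = threading.Lock()
        self._in_flight: dict[str, Callable[[], None]] = {}  # action_id -> cancel hook
        self._components: list[Callable[[], None]] = []      # one halt hook per layer

    def register_component(self, halt_hook: Callable[[], None]) -> None:
        """Each layer that can cause an external effect registers a hook."""
        self._components.append(halt_hook)

    def begin_action(self, action_id: str, cancel: Callable[[], None]) -> bool:
        """Return False if halted; otherwise track the action so it can be drained."""
        with self._lock:
            if self._halted.is_set():
                return False
            self._in_flight[action_id] = cancel
            return True

    def finish_action(self, action_id: str) -> None:
        with self._lock:
            self._in_flight.pop(action_id, None)

    def halt(self) -> list[str]:
        """Block new actions, notify every component, cancel what is in flight.
        Returns the ids of the actions that were drained."""
        with self._lock:
            self._halted.set()
            pending = dict(self._in_flight)
            self._in_flight.clear()
        for hook in self._components:
            hook()
        for cancel in pending.values():
            cancel()
        return sorted(pending)

    @property
    def halted(self) -> bool:
        return self._halted.is_set()
```

The cancel hook passed to `begin_action` is where an exchange-side cancel request would go. Without it, halting stops only the process, which is the failure described above.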
The distinction between human-in-the-loop, human-on-the-loop, and human-in-command matters here:
- Human-in-the-loop: a human approves every action. This is the safest mode, but it has operational overhead and does not scale to high-frequency systems.
- Human-on-the-loop: the agent acts autonomously, but a human monitors and can intervene.
- Human-in-command: the human can halt, override, or redirect the agent’s goal state at any time.
Article 14 does not use these labels, but its requirements (understand, intervene, stop) amount to human-in-command for high-risk systems. Human-on-the-loop may satisfy it for lower-stakes actions with sufficient monitoring. Know which mode you’re operating in before the regulator asks.
Section 2: Action authorization
The agent’s policy constraints must not live inside the agent. This is the single most important architectural rule for agents that handle capital.
Why the policy layer must be separate from the model
If the policy rules are expressed in the system prompt, an adversarial input can modify them. Not by “jailbreaking” in the dramatic sense. By constructing a context in which the model’s weighting of competing instructions causes it to de-prioritize the policy rules. Prompt injection is not theoretical. It has been demonstrated in production systems.
A policy layer that runs in the same process as the LLM inference has the same attack surface. A policy layer that runs as a separate service, validates actions against a configuration the agent cannot modify, and enforces limits at the API boundary is structurally different.
The policy layer should be:
- Separate: a distinct service or hardware boundary from the agent. The agent submits proposed actions. The policy layer approves or rejects them.
- Deterministic: given the same proposed action and the same policy configuration, the policy layer always returns the same result. No LLM probabilism here.
- Tamper-evident: changes to policy configuration are logged and require authorization. The agent cannot modify its own policy.
Allowlists, rate limits, and per-action approval gates
In practice, the policy layer enforces three categories of constraint:
Allowlists: the agent can only take actions from a defined set. It cannot invent new action types. The allowlist is defined at deploy time and requires explicit updates to expand. An agent with an allowlist of 20 action types cannot spontaneously decide to call an external API not on the list.
Rate limits: per-action-type caps that prevent runaway behavior. If the agent normally submits 10 orders per minute, a rate limit of 50/minute means a prompt-injected agent or a stuck loop cannot submit 10,000 orders before anyone notices.
Per-action approval gates: for high-stakes action types (withdrawal above a threshold, position size above a threshold, communication to an external party), require explicit human approval before execution. The agent queues the action, a human reviews and approves or rejects, then execution proceeds. This is operationally expensive for high-frequency actions, so gate only the actions where the approval cost is justified by the risk.
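A minimal sketch of a policy engine that applies the three constraint types in order. `Policy`, `PolicyEngine`, and the action names are hypothetical. The caller passes the clock in as `now`, so the same inputs and history always produce the same decision.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    NEEDS_HUMAN = "needs_human"


@dataclass(frozen=True)
class Policy:
    allowed_actions: frozenset[str]
    max_per_minute: dict[str, int]        # per-action-type rate cap
    approval_threshold: dict[str, float]  # amounts above this need a human


class PolicyEngine:
    def __init__(self, policy: Policy) -> None:
        self._policy = policy
        self._recent: dict[str, deque[float]] = defaultdict(deque)

    def evaluate(self, action_type: str, amount: float, now: float) -> Decision:
        p = self._policy
        # 1. Allowlist: unknown action types are rejected outright.
        if action_type not in p.allowed_actions:
            return Decision.REJECT
        # 2. Rate limit over a sliding 60-second window.
        #    An action type with no configured cap gets a cap of 0 (fail closed).
        window = self._recent[action_type]
        while window and now - window[0] >= 60.0:
            window.popleft()
        if len(window) >= p.max_per_minute.get(action_type, 0):
            return Decision.REJECT
        window.append(now)
        # 3. Approval gate: large amounts are queued for a human.
        if amount > p.approval_threshold.get(action_type, float("inf")):
            return Decision.NEEDS_HUMAN
        return Decision.APPROVE
```

In this sketch the engine lives in the same Python process as its caller for brevity. The separation argued for above would put `evaluate` behind a service boundary, with the `Policy` loaded from configuration the agent cannot write.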
Section 3: Attestation and audit trail
Logging is not enough. Logging proves what you recorded. It does not prove what the code was actually doing when it ran.
The gap between “we logged it” and “we can prove it”
A log file on an EC2 instance can be modified by anyone with instance access, the same access an attacker would have after a compromise. A database row can be overwritten. An audit trail that lives on the same infrastructure the agent runs on is only as trustworthy as that infrastructure’s security.
For regulatory purposes, “we logged it” is insufficient if you cannot also prove that the log was not modified after the fact. For serious incidents, you may need to demonstrate to a regulator or counterparty that a specific version of code was running at a specific time and produced specific outputs.
A concrete scenario: your agent approved a withdrawal at 2:47am that a customer is now disputing. Your audit log shows the approval. The customer’s counsel asks: can you prove the version of code that approved the withdrawal had the correct policy configuration loaded? Can you prove the log was not modified after the incident was discovered? Without attestation and hash-chained logs, the honest answer is “no.”
This is what attestation solves.
Hardware attestation: what it does and what it doesn’t
Trusted Execution Environments (TEEs) provide hardware-backed attestation. AWS Nitro Enclaves are the most accessible production implementation of this for cloud workloads.
The mechanism: when an enclave starts, the Nitro hypervisor measures the enclave image and records the measurements in Platform Configuration Registers (PCRs); PCR0 is the measurement of the enclave image file. On request, the hypervisor produces an attestation document: a CBOR-encoded, COSE-signed artifact whose certificate chain leads back to the AWS Nitro Attestation PKI root. The document cryptographically binds the enclave image (via PCR0) to optional fields supplied by the enclave: a public key generated inside the enclave, user data (for example, a digest of the signing request), and a nonce.
What attestation gives you: a verifiable proof that a specific version of code was running at the time a signing operation occurred. If someone asks “was this the production binary?” you can prove it was, because the PCR0 measurement would be different for any modified binary.
What attestation does not give you: proof that the inputs to the code were correct, that the agent’s reasoning was sound, or that the policy layer was configured correctly. Attestation proves the code; it doesn’t prove the behavior. The behavior proof comes from the combination of attestation + tamper-evident audit logs.
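The PCR0 comparison itself is small. The sketch below assumes the attestation document has already been decoded and that its signature and certificate chain were verified by a separate step, which is not shown. `pcr0_matches` is a hypothetical helper, and the dictionary shape (PCR index to raw bytes) is an assumption about how the decoded document is handed over.

```python
import hmac


def pcr0_matches(attested_pcrs: dict[int, bytes], expected_pcr0_hex: str) -> bool:
    """Compare the attested PCR0 with the pinned release measurement.

    Assumes `attested_pcrs` comes from an attestation document whose
    signature and certificate chain were already verified elsewhere.
    """
    try:
        expected = bytes.fromhex(expected_pcr0_hex)
    except ValueError:
        return False
    actual = attested_pcrs.get(0, b"")
    # Nitro PCR values are SHA-384 digests, so both sides should be 48 bytes.
    if len(expected) != 48 or len(actual) != 48:
        return False
    # Constant-time comparison; fail closed on any mismatch.
    return hmac.compare_digest(actual, expected)
```

A check like this is only meaningful after signature verification. Comparing PCR0 from an unverified document proves nothing, because anyone can construct a document containing the expected bytes.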
Tamper-evident audit logs
A tamper-evident audit log chains each entry to the previous one. The most straightforward implementation: each log entry includes a SHA-256 hash of the previous entry. To modify entry N, you’d need to re-hash all subsequent entries, which produces a different final hash. That’s detectable, provided a copy of a later hash is held somewhere the attacker cannot reach.
For regulatory-grade audit trails, combine: structured log entries (not free text), hash chaining, and periodic anchoring to an external tamper-evident store (a blockchain’s immutability can serve this role; so can a separate append-only log service the agent cannot write to). The goal is that any modification after the fact produces a detectable inconsistency.
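A minimal sketch of hash chaining with an externally anchored head. The entry layout and function names are illustrative, and the external anchor is represented by a plain string that the verifier is assumed to have fetched from a store the agent cannot write to.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list[dict], payload: dict) -> dict:
    """Append a structured entry whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    chain.append(entry)
    return entry


def verify_chain(chain: list[dict], anchored_head: str) -> bool:
    """Recompute every link, then compare the head with the anchored hash."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return prev_hash == anchored_head
```

The final comparison is the part that matters. An attacker with write access can rebuild an internally consistent chain, but cannot make it end in a hash that was anchored elsewhere before the modification.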
Section 4: Deterministic safety nets
The LLM is non-deterministic. The safety layer must not be.
Circuit breakers on capital flow
A circuit breaker monitors a condition and halts execution when the condition is breached. For agents handling capital, the canonical conditions:
- Drawdown threshold: total capital loss exceeds N% in a rolling window. Halt trading. Require human review before resuming.
- Consecutive loss limit: N consecutive losing positions. Not just a risk metric. This is a signal that market conditions or the agent’s behavior have changed in a way that merits human review.
- Velocity limit: more than N capital-moving actions in M minutes. An agent that suddenly starts submitting 10× its normal action volume has probably been prompt-injected or entered a loop.
The critical implementation detail: the circuit breaker’s state must survive process restart. If the agent trips a breaker, restarts (due to an unrelated failure), and resumes with no memory of the trip, what you have is not a circuit breaker. It’s a delay. State must be persisted durably.
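A sketch of a breaker covering the three conditions, with the trip persisted to a file. The class name, thresholds, and file format are assumptions for illustration. Only the trip itself is persisted here; the loss counter and the velocity window are kept in memory.

```python
import json
import os
from collections import deque


class CapitalCircuitBreaker:
    """Trips on drawdown, consecutive losses, or velocity; the trip is stored on disk."""

    def __init__(self, state_path: str, max_drawdown_pct: float,
                 max_consecutive_losses: int, max_actions: int, window_s: float) -> None:
        self._path = state_path
        self._max_dd = max_drawdown_pct
        self._max_losses = max_consecutive_losses
        self._max_actions = max_actions
        self._window_s = window_s
        self._losses = 0
        self._actions: deque[float] = deque()
        self.tripped_reason = None
        if os.path.exists(state_path):  # a prior trip survives the restart
            with open(state_path) as f:
                self.tripped_reason = json.load(f)["reason"]

    def _trip(self, reason: str) -> None:
        self.tripped_reason = reason
        tmp = self._path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"reason": reason}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self._path)  # atomic rename into place

    def record_action(self, now: float) -> None:
        """Velocity limit: more than max_actions within window_s seconds."""
        self._actions.append(now)
        while self._actions and now - self._actions[0] > self._window_s:
            self._actions.popleft()
        if len(self._actions) > self._max_actions:
            self._trip("velocity")

    def record_result(self, pnl: float, drawdown_pct: float) -> None:
        self._losses = self._losses + 1 if pnl < 0 else 0
        if drawdown_pct >= self._max_dd:
            self._trip("drawdown")
        elif self._losses >= self._max_losses:
            self._trip("consecutive_losses")

    def allow(self) -> bool:
        return self.tripped_reason is None

    def operator_reset(self) -> None:
        """Called after human review; the agent has no path to this in production."""
        if os.path.exists(self._path):
            os.remove(self._path)
        self.tripped_reason = None
        self._losses = 0
        self._actions.clear()
```

The constructor reads the state file before anything else, so a process that restarts at 3:05am after a 3am trip starts in the halted state.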
Bounded action spaces
Every action the agent can take should have a defined maximum effect. Maximum order size. Maximum position. Maximum withdrawal. These are not policy-layer rules. They are hard limits enforced at the execution layer, below the policy layer. Even if the policy layer is compromised, the execution layer enforces the bounds.
The difference between a policy rule and an execution limit: a policy rule can be overridden by changing the policy configuration. An execution limit is hard-coded or configured at deploy time by a different team, under a different authorization model, and cannot be changed by agent operations.
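A sketch of what an execution-layer bound might look like. `ExecutionLimits` and `ExecutionLayer` are hypothetical names, and the limits are assumed to be loaded once at deploy time from configuration owned by a different team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionLimits:
    """Set at deploy time; this sketch exposes no way to change them at runtime."""
    max_order_size: float
    max_position: float
    max_withdrawal: float


class LimitExceeded(Exception):
    pass


class ExecutionLayer:
    """Last hop before the venue. It does not consult the policy layer at all."""

    def __init__(self, limits: ExecutionLimits) -> None:
        self._limits = limits
        self._position = 0.0

    def submit_order(self, qty: float) -> float:
        if abs(qty) > self._limits.max_order_size:
            raise LimitExceeded(f"order size {qty} exceeds {self._limits.max_order_size}")
        new_position = self._position + qty
        if abs(new_position) > self._limits.max_position:
            raise LimitExceeded(f"position {new_position} exceeds {self._limits.max_position}")
        self._position = new_position
        return new_position

    def withdraw(self, amount: float) -> float:
        if amount <= 0 or amount > self._limits.max_withdrawal:
            raise LimitExceeded(f"withdrawal {amount} outside (0, {self._limits.max_withdrawal}]")
        return amount
```

The point of the design is that `ExecutionLayer` never asks whether an action was approved. An action the policy layer waved through still fails here if it exceeds the bound.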
Dead-man switches
A dead-man switch halts the agent if a required condition stops being true. The simplest implementation: the agent must report a heartbeat at a defined interval. If the heartbeat stops (the agent is stuck, has crashed, or is in a loop), the system triggers a halt.
More sophisticated: the agent must produce a signed token every N seconds as proof of health. If the token stops, or if the token’s signed properties don’t match expected state, the system halts. This prevents a zombie agent (a process that’s technically running but not behaving correctly) from continuing to operate.
The failure mode this prevents is real: an agent whose main loop enters a tight retry cycle on a failing tool call. It’s technically alive (process is running, heartbeat from the OS is fine) but not doing useful work, and it may be accumulating partial state. A dead-man switch that checks not just “is the process running” but “did the agent complete its last N iterations within expected time bounds” catches this. If iteration N is still in progress after 3× the expected duration, halt and alert.
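A minimal sketch of the progress-based check, assuming the agent loops continuously and a watchdog outside the agent process calls `check`. `DeadManSwitch` is a hypothetical name, and timestamps are passed in explicitly so the logic does not depend on a particular clock.

```python
class DeadManSwitch:
    """Halts when no iteration has completed within tolerance x the expected duration."""

    def __init__(self, expected_iteration_s: float, now: float,
                 tolerance: float = 3.0) -> None:
        self._limit_s = expected_iteration_s * tolerance
        self._last_completed = now
        self.halted = False

    def iteration_completed(self, now: float) -> None:
        """Called by the agent at the end of each full loop iteration.
        Only completed iterations count as progress; a live process that
        is stuck in a retry cycle never calls this."""
        self._last_completed = now

    def check(self, now: float) -> bool:
        """Called by the watchdog. Returns True once the switch has fired.
        The halt is sticky: late progress does not clear it."""
        if now - self._last_completed > self._limit_s:
            self.halted = True
        return self.halted
```

Making the halt sticky is a design choice in this sketch. An agent that recovers on its own after stalling is still an agent whose stall nobody has looked at yet.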
The compounding risk: multiple agents sharing state
Many agentic architectures in 2026 are multi-agent. An orchestrator dispatches subtasks to specialized agents. Each agent has its own action set, its own rate limits, and its own audit trail, but they share state through a common memory or database layer.
The safety infrastructure above needs to handle multi-agent deployments correctly. A kill switch that halts the orchestrator does not automatically halt the subagents if they have independent execution loops. A circuit breaker that monitors total capital exposure needs to aggregate across all agents, not just the one that tripped the breaker. An audit trail that lets you reconstruct what happened needs to link the orchestrator’s decisions to the subagent actions that resulted from them.
This is an area where most teams have gaps. The single-agent safety patterns are understood; the multi-agent composability of those patterns is not. Design for it before you ship the multi-agent architecture, not after.
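As a sketch of the composability problem, the class below keeps one shared halt flag and one aggregated exposure figure for every agent in a deployment. `FleetSafety` and the agent ids are hypothetical, and the state is in memory only. Linking orchestrator decisions to subagent actions in the audit trail is not covered here.

```python
class FleetSafety:
    """Shared halt flag and aggregated exposure for an orchestrator and its subagents."""

    def __init__(self, max_total_exposure: float) -> None:
        self._max_total = max_total_exposure
        self._exposure: dict[str, float] = {}
        self.halted = False

    def register(self, agent_id: str) -> None:
        self._exposure.setdefault(agent_id, 0.0)

    def total_exposure(self) -> float:
        # Gross exposure: long and short positions do not net out.
        return sum(abs(v) for v in self._exposure.values())

    def report_exposure(self, agent_id: str, exposure: float) -> None:
        self._exposure[agent_id] = exposure
        if self.total_exposure() > self._max_total:
            self.halted = True  # halts every agent, not only the reporter

    def halt_all(self) -> None:
        """Operator kill switch for the whole deployment."""
        self.halted = True

    def may_act(self, agent_id: str) -> bool:
        """Every agent checks this before each action; unknown agents are refused."""
        return not self.halted and agent_id in self._exposure
```

Summing absolute values is a conservative assumption made for this example. A desk that nets long against short exposure would replace `total_exposure` with its own aggregation.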
Section 5: A worked example
The following sketch describes the architecture I’m building at ZeroCopy Systems. It is a first-person account, not a general recommendation.
The signing service runs inside an AWS Nitro Enclave. The enclave image is PCR0-pinned. The KMS key policy will only decrypt for an enclave whose PCR0 matches the measured production binary. Any modification to the enclave code produces a different PCR0, which means the enclave cannot access the signing key. The signing key never leaves the hardware boundary.
The policy engine is a separate process on the parent instance. It receives proposed signing requests from the agent, evaluates them against a policy file (max drawdown, position limits, forbidden action types), and either approves or rejects. The policy file is versioned, and changes require authorization from a separate key. The agent cannot modify the policy.
Signing latency is 42µs at P50, measured in a benchmark suite on dedicated hardware. This is a benchmark result, not a production figure; production validation is ongoing as the system scales. The gap to AWS KMS (130ms) has a structural cause: KMS requires a network round-trip to a service you don’t control, while the enclave is local to the instance.
The audit trail: every signing request, the policy evaluation result, and the signing outcome are written to a hash-chained log before the result is returned to the caller. The log is shipped to a separate store that the agent cannot write to.
This is roughly the architecture that a team building an agent-based trading or custody system should evaluate. It’s not unique to ZeroCopy. The components (Nitro Enclaves, a separate policy engine, hash-chained logs) are available to any team. The work is in the integration and in the operational discipline to keep them working correctly under production conditions.
Section 6: Readiness checklist
Rate your system against these before your next deploy. For high-risk AI systems under the EU AI Act, this is the minimum viable set.
- Kill switch exists and is tested. Can you halt the agent right now, from your phone, without logging into the production host? If you halted it, would it actually stop placing orders, or would in-flight actions continue?
- Kill switch state persists across restarts. If the agent process crashes and restarts automatically, does it re-read the halt state before resuming, or does it start fresh?
- Policy layer is separate from the model. Are your agent’s action limits enforced by a component the agent cannot modify, or are they expressed in the system prompt?
- Audit log is written before action execution. Does your log capture the intent (what the agent decided to do) before the action executes, not just the outcome after?
- Audit log is tamper-evident. Could you prove, in an external review, that your audit log was not modified after the fact? Does it use hash chaining or equivalent?
- Attestation is in place for signing operations. If a signing operation is disputed, can you prove which version of code was running at the time?
- Circuit breakers are wired to durable state. If the circuit breaker trips at 3am and the process restarts at 3:05am, does the agent resume or does it start in halted state?
- Execution limits are below the policy layer. Is there a hard maximum order size or capital amount that is enforced at the execution layer, regardless of what the policy layer allows?
- Override mechanism is reachable and time-bounded. Can an operator override the agent within a defined time bound (e.g., 60 seconds) from anywhere, without requiring production server access?
- Dead-man switch is active. Does the system halt if the agent stops reporting health, rather than continuing to run with no oversight?
- Human-in-command is formally defined. Do you know which actions require pre-authorization, which require post-notification, and which are fully autonomous? Is that definition written down and versioned?
- Article 14 scope assessment is complete. Have you determined whether your system qualifies as high-risk under the EU AI Act? If you don’t know, the answer is probably “yes” if you’re moving money or making decisions that affect individuals. The enforcement date is August 2, 2026.
- Incident response for agent misbehavior is documented. If the agent does something unexpected at 3am, does your on-call runbook cover AI-agent-specific failure modes (stuck loops, prompt injection, policy bypass), or does it only cover infrastructure failures?
- Blast radius is bounded. If the agent is fully compromised (worst case), what is the maximum possible financial impact? Is that number acceptable? If not, lower the execution limits.
Closing
The infrastructure described here is not aspirational. Every component is available today: Nitro Enclaves, separate policy engines, hash-chained logs, circuit breakers with durable state, kill switches that propagate correctly. The work is in integrating them and in maintaining the operational discipline to keep them working as the system scales.
Most teams skip this work because it feels like overhead relative to building the agent itself. That calculation inverts the moment something goes wrong at scale, or the moment an EU regulator asks for your Article 14 compliance documentation.
If you want this assessed against your specific system, the Article 14 Gap Audit is the productized version of this checklist: a 10-day written gap analysis with a remediation roadmap. If your system is a trading infrastructure rather than an AI agent, the HFT Infrastructure Audit covers the latency path, failure modes, and operational gaps.
The ZeroCopy Systems case study describes the TEE signing architecture in more detail. The mpc-vs-hsm-vs-multisig post covers the custody key management decision framework for systems that handle assets directly.
External references
- EU AI Act Article 14 text: the primary source for the human-oversight requirements. Annex III lists the high-risk system categories.
- AWS Nitro Enclaves attestation documentation: the technical reference for PCR measurements, the attestation document format, and the verification procedure.