Sovereign Trading Infrastructure: Why the Next Generation of HFT Will Run Inside Enclaves

Bybit, FTX, Mt. Gox: fast infrastructure, none of it verifiable. TEEs, MPC, and AI agents are converging to fix that.

ZeroCopy is built on a thesis that feels obvious in retrospect and was treated as impractical for years: the most important property of trading infrastructure is not speed - it is verifiability.

Speed is necessary. A signing infrastructure that cannot keep up with order flow is not a signing infrastructure. But speed without verifiability is the architecture that lost $1.5 billion at Bybit, $8 billion at FTX, $625 million at Ronin, and $450 million at Mt. Gox. Every one of those firms had fast infrastructure. Not one of them had infrastructure that made it possible to prove, in real time, that the system was doing what it claimed to be doing.

The firms that will own institutional crypto trading in the next decade are not the ones with the best signals. Signals get arbitraged away. The winners will be the firms whose infrastructure is designed so that “trust us, it’s secure” is replaced by “verify it yourself, here is the cryptographic proof.” This post is the argument for why that shift is happening, what it requires, and how ZeroCopy is building it.


Section 1: The Custody Crisis

The history of crypto trading is in large part a history of custody failures. Not security theater failures - not firms that failed to install a firewall or use strong passwords. Architectural failures: systems built on the premise that if you keep the keys somewhere hard to find, they are secure. They were not.

Bybit, February 2025: $1.5 billion. The largest single crypto theft in history. The attack vector was not a brute-force compromise of the signing infrastructure. It was a supply chain compromise of the Safe multisig frontend: the JavaScript that Bybit’s operators relied on to show them what they were signing. The keys were properly secured. The interface that humans used to review and approve transactions was compromised. The operators approved what the interface presented as routine transfers; what they actually signed moved the funds under attacker control.

The lesson is not “rotate your keys more frequently.” The lesson is that in a system where human operators must review transaction details through software interfaces, the security of the entire signing operation is bounded by the security of the viewing interface. If the interface can lie about what you are signing, the keys are irrelevant.

FTX, November 2022: $8 billion. Not a technical hack. A discretionary failure: a small number of individuals had the access and the authority to move funds without checks that would have made those moves visible or reversible. The infrastructure did exactly what it was designed to do. The design lacked enforced, programmatic limits on who could move how much where.

The lesson: “we trust our people” is not a security architecture. The question is whether the infrastructure makes it possible to do the wrong thing. FTX’s infrastructure made it trivially possible for a small group to move billions in customer funds. Programmatic limits with cryptographic enforcement would not have prevented the decision to misuse the funds - but they would have made the decision visible at the time it was made rather than months later.

Ronin Bridge, March 2022: $625 million. A nine-validator multisig protecting $625 million, with five validators needed to authorize withdrawals. The attacker compromised five of the nine validators - which turned out to include four validators operated by the same organization (Sky Mavis) and one operated by the DAO they controlled. The multisig appeared to provide distributed governance. In practice, it concentrated control.

The lesson: the security model of a multisig is not just the threshold (5-of-9) but the independence of the key holders. If five of nine validators can be controlled through one organization, the effective threshold is 1-of-1. This is not a validator problem - it is a key distribution problem that requires architectural attention, not just operational attention.
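The gap between the nominal and the effective threshold can be computed directly. The sketch below is illustrative only: the `controllers` mapping (which entity can actually sign with each validator key) is a hypothetical input that an auditor would have to establish, and the validator and organization names are made up.

```python
from collections import Counter

def effective_threshold(controllers, threshold):
    """Smallest number of independent entities whose keys reach `threshold`.

    `controllers` maps each validator name to the entity that can actually
    sign with its key (hypothetical labels supplied by whoever audits the setup).
    """
    # Count keys per entity, largest holders first: compromising the biggest
    # holders is the cheapest way to collect `threshold` signatures.
    keys_per_entity = sorted(Counter(controllers.values()).values(), reverse=True)
    signatures = 0
    for entities_needed, key_count in enumerate(keys_per_entity, start=1):
        signatures += key_count
        if signatures >= threshold:
            return entities_needed
    raise ValueError("threshold exceeds the number of validators")

# Layout as described above: one organization can sign for five of nine validators.
concentrated = {f"v{i}": "org-a" for i in range(1, 6)}
concentrated.update({"v6": "org-b", "v7": "org-c", "v8": "org-d", "v9": "org-e"})

# Same 5-of-9 threshold, but every validator is independently controlled.
independent = {f"v{i}": f"org-{i}" for i in range(1, 10)}

print(effective_threshold(concentrated, 5))
print(effective_threshold(independent, 5))
```

Both layouts advertise 5-of-9. Only the second one requires five separate compromises.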

Mt. Gox, 2014: $450 million. The grandfather of custody failures. Private keys stored in a hot wallet on a server that was directly internet-accessible. The keys were extracted over a period of years through a series of unauthorized accesses that the exchange’s accounting systems failed to detect because the balance reported by the exchange software was not compared against the blockchain state.

The lesson is the oldest one: trusting the internal accounting of a system to verify the security of that system’s assets is circular. The only authoritative record is the blockchain.

The common thread across all four failures is not complexity. FTX was relatively simple - it was straightforward fraud with access to a poorly controlled system. Bybit was sophisticated - a multi-stage supply chain attack. Ronin was an architectural design failure. Mt. Gox was operational neglect. The common thread is the absence of cryptographic verification: in each case, the people who needed to know whether the system was behaving correctly were relying on software outputs they could not independently verify, or were not monitoring at all.

The solution is not better passwords, better employees, or better operational procedures, though all of those help at the margin. The solution is an architecture in which the behavior of the signing infrastructure is cryptographically verifiable in real time, and deviations from expected behavior are detectable by anyone with the attestation key - not just by the operators.


Section 2: The Three Revolutions

Three technologies have matured simultaneously to make sovereign trading infrastructure possible. No single one solves the problem. All three together produce a system that addresses the custody crisis at the architectural level.

Revolution 1: Trusted Execution Environments

Hardware-isolated execution is no longer an experimental technology. AWS Nitro Enclaves have been production-grade since 2020. Intel SGX is shipping in enterprise Xeons. AMD SEV-SNP is available on bare metal and select cloud instances. The core properties these technologies provide have been discussed at length in Post 1 of this series. For the manifesto argument, the key point is:

TEEs remove the cloud provider from your trust chain.

This is a non-obvious but foundational shift. When your signing keys live in an EC2 instance, AWS has access to them. This is not paranoia - it is the technical reality of how EC2 instances work. AWS employees with the appropriate internal access can read the memory of your running instance. This is why AWS built Nitro Enclaves: specifically to create a class of EC2 workloads where AWS employees cannot access the execution environment.

When your signing infrastructure runs in a Nitro Enclave with keys sealed to PCR0, the following becomes true: even a full compromise of your AWS account - every IAM user, every access key, every session token - does not give an attacker access to your signing keys. The keys are sealed in hardware. They can only be unsealed by an enclave running the exact code they were sealed against. Changing the code breaks the seal.

The attestation document extends this further: anyone, at any time, can verify that the enclave running your signing infrastructure is running the code it claims to run. The verification is cryptographic. It requires no trust in you or in your operators; the only thing a verifier relies on is the Nitro attestation certificate chain, whose root certificate AWS publishes.

Revolution 2: Multi-Party Computation

Multi-party computation (MPC) for key management solves the single-point-of-compromise problem at the key level. In an MPC scheme, a private key is never fully assembled in any one place. Instead, key shares are distributed across multiple parties, and a signing operation is performed as a distributed computation where each party contributes their share. The full key never exists as a single value in memory - only the signature output exists.

The cryptographic protocols that make this work (threshold ECDSA, particularly GG20 and CGGMP21) have matured significantly. Production MPC systems like Fireblocks, Qredo, and ZeroCopy’s own implementation can perform a threshold signature in 10-100ms depending on the number of parties and the protocol variant.

MPC eliminates the “if you get the key you get everything” attack surface. To steal from an MPC-protected wallet, you need to compromise a threshold of key holders simultaneously. If the key holders are geographically distributed, operationally independent, and run on isolated infrastructure, achieving this simultaneously is not impossible but is dramatically harder than compromising a single signing server.
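The threshold property itself can be illustrated with Shamir secret sharing. This is a toy sketch, not threshold ECDSA: protocols such as GG20 and CGGMP21 produce a signature without ever recombining the key, whereas the `recover_secret` function here reconstructs it purely to show that any `threshold` shares suffice. The field prime and function names are assumptions chosen for illustration.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime used as a toy field; not a curve order

def split_secret(secret, threshold, num_shares):
    """Split `secret` into points on a random polynomial of degree threshold - 1."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+)
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

key = 123456789
shares = split_secret(key, threshold=3, num_shares=5)
print(recover_secret(shares[:3]) == key)   # any three shares recover the key
print(recover_secret(shares[2:5]) == key)
```

With fewer than three shares, interpolation yields a value that is unrelated to the key except with negligible probability, which is the property the paragraph above relies on.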

The combination of MPC and TEEs is particularly powerful: key shares stored in enclaves, signing computation performed in a distributed TEE protocol. The key shares are sealed to specific enclave measurements. Extracting a key share requires not just compromising the host server but breaking the TEE’s hardware isolation.

Revolution 3: AI Agents for Operational Complexity

The first two revolutions address the key management and signing layer. The third addresses the operational layer - the cognitive work that humans currently do to manage multi-venue trading operations.

Running 12 venues simultaneously, synthesizing macro signals, monitoring for regime changes, reviewing execution quality, preparing position summaries for principal review - this is the work that has historically required human analysts. AI agents, particularly those with persistent memory architectures (see Post 3), are now capable of performing this synthesis work reliably at scale.

This matters for sovereign infrastructure because the operational layer has historically been where security controls break down in practice. The Bybit attack targeted the human operator’s signing review interface precisely because humans are the bottleneck in the approval workflow. An AI agent that synthesizes the context for a human decision reduces the cognitive load on the human, which reduces the probability of the human approving something they should reject.

The agent is not autonomous. It does not execute capital decisions. It drafts the memo, structures the context, flags the anomalies, and presents the decision to the human. The human, with a well-structured memo rather than raw interface outputs, is better equipped to identify a compromised interface showing false information.

Three revolutions, three layers: TEEs for key isolation and provability, MPC for key distribution, AI agents for operational assistance within guardrails.


Section 3: What Sovereign Infrastructure Looks Like

The argument is stronger with a concrete illustration. Here is what a trading operation looks like at a Sovereign-Infra firm versus a Legacy-Infra firm, walking through each stage of the order lifecycle.

Stage 1: Order Generation

At a Legacy-Infra firm, the signal model runs on a cloud VM. The model’s outputs - “buy 10 BTC at market, confidence 0.73” - go to the execution engine via an internal API. The signal model is unattested. An auditor who wants to verify that the model running today is the model described in the strategy document has no cryptographic basis for that verification.

At a Sovereign-Infra firm, the signal model runs in an attested execution environment. Its PCR0 value is a public commitment to the exact model binary. An LP who wants to verify that the strategy running today is the strategy they evaluated provides a nonce and receives an attestation document. The verification is immediate and cryptographic.
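The LP-side check described here reduces to three comparisons. The sketch below assumes the attestation document has already been decoded and its signature and certificate chain verified by some other component. `ParsedAttestation` and its fields are hypothetical names for that decoded view, not a real library type.

```python
import hmac
import secrets
from dataclasses import dataclass

@dataclass(frozen=True)
class ParsedAttestation:
    """Hypothetical, already-decoded view of an attestation document."""
    pcr0: bytes            # measurement of the enclave image
    nonce: bytes           # challenge echoed back by the enclave
    signature_valid: bool  # outcome of signature and certificate-chain checks done elsewhere

def new_challenge() -> bytes:
    """Fresh random nonce so an old attestation cannot be replayed."""
    return secrets.token_bytes(32)

def accept_attestation(doc: ParsedAttestation, expected_pcr0: bytes, challenge: bytes) -> bool:
    if not doc.signature_valid:
        return False
    if not hmac.compare_digest(doc.nonce, challenge):
        return False  # stale or replayed document
    return hmac.compare_digest(doc.pcr0, expected_pcr0)

# Usage: the LP holds the PCR0 value the firm publicly committed to.
committed_pcr0 = bytes.fromhex("ab" * 48)  # placeholder value for illustration
challenge = new_challenge()
doc = ParsedAttestation(pcr0=committed_pcr0, nonce=challenge, signature_valid=True)
print(accept_attestation(doc, committed_pcr0, challenge))
```

The nonce is what makes the check "today" rather than "at some point in the past": a document produced for an earlier challenge fails the second comparison.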

Stage 2: Risk Check

At a Legacy-Infra firm, the risk check is software running on a server. A misconfigured rule, a bug in the risk engine, or an adversarially crafted input can potentially cause the risk check to pass a trade it should fail. The audit trail is log files, which are writable by anyone with server access.

At a Sovereign-Infra firm, the risk check runs inside the same enclave cluster as the signing infrastructure. The policy engine is sealed to a specific code hash. Changing the risk limits requires deploying a new enclave image with a new PCR0, which breaks the key seal until an authorized migration is performed. The risk limits are cryptographically enforced, not just operationally enforced.
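Why a limit change breaks the seal can be shown with a stand-in measurement. In the sketch below, `image_measurement` hashes the engine code together with the risk limits to mimic a code measurement. This is a simplification: the real PCR0 is computed by the Nitro tooling over the enclave image file, and the function names and limit keys here are hypothetical.

```python
import hashlib
import hmac
import json

def image_measurement(code: bytes, risk_limits: dict) -> str:
    """Stand-in for an enclave image measurement (SHA-384 over code and limits)."""
    policy_bytes = json.dumps(risk_limits, sort_keys=True).encode()
    h = hashlib.sha384()
    h.update(len(code).to_bytes(8, "big"))  # length prefix keeps the two fields unambiguous
    h.update(code)
    h.update(policy_bytes)
    return h.hexdigest()

def can_unseal(sealed_to: str, running: str) -> bool:
    """Keys sealed to one measurement are released only to that exact measurement."""
    return hmac.compare_digest(sealed_to, running)

engine = b"risk-engine-build-1"  # placeholder bytes standing in for the binary
sealed = image_measurement(engine, {"max_position_btc": 10})
tampered = image_measurement(engine, {"max_position_btc": 100})

print(can_unseal(sealed, sealed))
print(can_unseal(sealed, tampered))
```

Raising the limit produces a different measurement, so the existing keys stay locked until someone performs the authorized migration.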

Stage 3: Signing

At a Legacy-Infra firm, signing happens with keys stored in a secrets manager, a hardware security module, or (in worse cases) directly in environment variables. The signing operation is opaque - you can see that a signature was produced but not necessarily verify that the signing process followed its policy.

At a Sovereign-Infra firm, signing happens in a Nitro Enclave. Every signing operation produces a log entry that is signed by the enclave’s ephemeral key. The enclave’s ephemeral key is verified by the attestation document. A third party can verify: this signature was produced by an enclave running code X, given input Y, at time T. The signing process is not just audited after the fact - it is provable.
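A tamper-evident log of this kind can be sketched as a hash chain in which every entry commits to its predecessor. One loud assumption: the HMAC tag below is a stand-in for the enclave's asymmetric signature, used only because it is available in the standard library. With a real ephemeral key pair, a verifier would check signatures without being able to forge them. All names are hypothetical.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # previous-hash value for the first entry

def _body(prev_hash, payload):
    return json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True).encode()

def append_entry(log, key, payload):
    """Append an entry that commits to the previous entry's hash."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS
    body = _body(prev_hash, payload)
    entry = {
        "prev": prev_hash,
        "payload": payload,
        "entry_hash": hashlib.sha256(body).hexdigest(),
        "tag": hmac.new(key, body, hashlib.sha256).hexdigest(),  # signature stand-in
    }
    log.append(entry)
    return entry

def verify_log(log, key):
    """Recompute every hash and tag; any edit, reorder, or dropped prefix fails."""
    prev_hash = GENESIS
    for entry in log:
        body = _body(entry["prev"], entry["payload"])
        if entry["prev"] != prev_hash:
            return False
        if hashlib.sha256(body).hexdigest() != entry["entry_hash"]:
            return False
        expected_tag = hmac.new(key, body, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(entry["tag"], expected_tag):
            return False
        prev_hash = entry["entry_hash"]
    return True

key = b"\x01" * 32  # placeholder key for the example
log = []
append_entry(log, key, {"tx": "a1", "amount": 10, "risk_check": "pass"})
append_entry(log, key, {"tx": "a2", "amount": 25, "risk_check": "pass"})
print(verify_log(log, key))
```

Contrast this with the plain log files of the legacy setup: editing `amount` in the first entry after the fact makes `verify_log` return `False`.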

The benchmark numbers from ZeroCopy’s production system: p50 signing latency 42µs, p99 87µs, over 3,000x faster than AWS KMS on the equivalent operation. Speed and verifiability are not in tension. The enclave is both faster and more verifiable than the alternatives.

Stage 4: Settlement

At a Legacy-Infra firm, settlement involves trusting a counterparty’s reconciliation. Whether you are settling on-chain through a custodian, or off-chain through an exchange’s internal accounting, you are trusting the counterparty’s representation of your balance.

At a Sovereign-Infra firm, settlement conditions are encoded in smart contracts where possible. On-chain settlement with programmable conditions removes the counterparty entirely from specific transaction types. For exchange-settled trades, the sovereign approach is to verify independently against on-chain state rather than accepting the exchange’s balance report as authoritative.

Mt. Gox would not have hemorrhaged $450 million over years if the exchange’s internal balance had been continuously reconciled against on-chain state by an independent process.

Stage 5: Reconciliation

At a Legacy-Infra firm, reconciliation is an operational process - teams of people checking ledgers against exchange records. It is lagged, it is expensive, and it catches fraud or errors in weeks or months.

At a Sovereign-Infra firm, reconciliation is programmatic and continuous. The attestation log from the signing infrastructure produces a deterministic record of every signing operation: exactly what was signed, when, by which enclave, with what risk check outcome. An independent reconciliation process (running outside the trading infrastructure) can compare the attestation log against exchange records and blockchain state in near-real-time. Discrepancies surface in minutes, not months.

The Ronin bridge attack drained $625 million and went unnoticed for six days. A continuous reconciliation process comparing expected withdrawals (derivable from the attestation log) against actual on-chain withdrawals would have detected the anomaly within hours of the first unauthorized transaction.
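The core of such a reconciliation process is a set comparison. The sketch below assumes both sides have been reduced to a mapping from transaction id to amount in integer base units. How those mappings are built from the attestation log and from chain data is outside the sketch, and the names are hypothetical.

```python
def reconcile(attested, on_chain):
    """Compare attested signing records with observed on-chain withdrawals.

    Both arguments map transaction id -> amount in integer base units.
    """
    shared = attested.keys() & on_chain.keys()
    return {
        # Moved on-chain with no matching signing record: the alarm case.
        "unattested": {tx: amt for tx, amt in on_chain.items() if tx not in attested},
        # Signed and observed, but for a different amount.
        "mismatched": {tx: (attested[tx], on_chain[tx])
                       for tx in shared if attested[tx] != on_chain[tx]},
        # Signed but not yet seen on-chain: normal for a short window.
        "pending": {tx: amt for tx, amt in attested.items() if tx not in on_chain},
    }

attested = {"tx1": 500, "tx2": 1200, "tx3": 75}
on_chain = {"tx1": 500, "tx2": 1250, "tx9": 173600}

report = reconcile(attested, on_chain)
print(report["unattested"])
print(report["mismatched"])
print(report["pending"])
```

Run on a schedule from outside the trading infrastructure, a non-empty `unattested` bucket is the signal that funds moved without the signing enclave having approved it.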


Section 4: The Benchmark Reality

ZeroCopy’s production numbers, because every argument of this kind requires specifics:

Operation                        p50        p99       p999
──────────────────────────────────────────────────────────
ECDSA P-256 sign (in enclave)    42µs       87µs     145µs
End-to-end (app to response)    127µs      210µs     380µs
Risk check (policy engine)       18µs       31µs      67µs
Attestation document fetch       4.2ms      8.1ms    12ms

Comparison baselines:
AWS KMS (same region)           ~160ms     ~350ms    (highly variable)
Traditional HSM (nCipher)        ~3ms       ~8ms
Fireblocks API (network)        ~180ms     ~400ms
Self-hosted key (no HSM)         ~0.3ms     ~1ms     (no audit trail)

The p50 advantage over AWS KMS is the number that gets quoted: on the figures in the table, ~160ms against 42µs is roughly 3,800x. But the more important comparison is variance. AWS KMS p99 at 350ms means a production system making 1,000 signing requests per day will experience approximately 10 signing operations per day where the latency is over 350ms. In a live market with fast-moving prices, those 10 operations are likely to cluster at moments of high volatility - exactly when latency costs the most.
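The tail arithmetic generalizes to any request volume: the expected number of requests slower than a given percentile is the volume times the fraction above that percentile. A minimal sketch, with a hypothetical helper name and exact fractions to avoid floating-point noise:

```python
from fractions import Fraction

def expected_tail_events(requests_per_day: int, percentile: str) -> Fraction:
    """Expected daily count of requests slower than the given percentile latency.

    `percentile` is passed as a string such as "99" or "99.9" so the
    arithmetic stays exact.
    """
    tail_fraction = 1 - Fraction(percentile) / 100
    return requests_per_day * tail_fraction

print(expected_tail_events(1_000, "99"))        # requests per day above p99
print(expected_tail_events(1_000, "99.9"))      # requests per day above p999
print(expected_tail_events(1_000_000, "99"))    # same p99 at higher volume
```

At a thousand requests a day the p99 tail is a handful of events. At a million it is ten thousand, which is why the tail, not the median, sets the operational budget.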

The Nitro Enclave’s p999 at 145µs means 999 of every 1,000 signing operations complete in under 0.15ms. Nearly the entire enclave latency distribution sits more than a thousand times below a single AWS KMS p50 observation.

The comparison to a self-hosted key (no HSM, ~0.3ms, no audit trail) makes the value proposition explicit. On the table’s numbers, the enclave’s end-to-end p50 (127µs) is no slower than an unprotected key on the same server, and it produces a cryptographically verifiable audit trail for every operation. The overhead the enclave path adds around the raw signature is the gap between the 42µs in-enclave sign and the 127µs end-to-end response: 85µs. That is the cost of provability. For most trading applications operating at sub-second cadence, it is not a meaningful tradeoff. You are buying verifiability for 85µs per signing operation.

The HSM comparison (3-8ms) shows that TEE-based signing is not just more auditable than traditional HSMs - it is 20-70x faster. The latency of an HSM is a consequence of its architecture (FIPS-certified, dedicated hardware with conservative firmware) and its interface (PKCS#11 over a local bus or network). A Nitro Enclave running modern ECDSA implementations is faster because it is a general-purpose CPU running optimized code, with the security properties provided by hardware isolation rather than dedicated hardware.


Section 5: The AI Convergence

The infrastructure layer - TEEs, MPC, provable signing - is the foundation. What makes the foundation interesting in 2026 is what runs on it.

The next wave is not AI trading strategies. Alpha-generating ML models are increasingly commoditized: the techniques are published, the data is available, and a well-funded team can implement a competitive signal model in months. The differentiation in systematic trading is not in who has the best model - it is in who has the operational infrastructure to run the best model continuously, at scale, with the risk controls and auditability that institutional capital requires.

This is where AI agents operating within sovereign constraints change the picture.

The operational complexity of running a professional systematic trading desk is enormous. Monitoring 12 venues simultaneously for anomalies. Synthesizing macro context against current positions. Reviewing execution quality against model expectations. Preparing position summaries for principal review. Generating post-trade analysis. Managing strategy lifecycle decisions - when to pause a strategy, when to adjust parameters, when to retire an approach. These tasks require sustained human attention, and that attention is the bottleneck.

AI agents with persistent memory can handle the synthesis and monitoring functions at a quality level that was not achievable 24 months ago. The Letta-based Sovereign Consciousness agent operating at ZeroCopy synthesizes multi-source context at 30-minute cadence, maintains a working memory of strategy performance patterns going back months, and drafts human-review memos that structure the decision a principal needs to make rather than dumping raw data.

The constraint that makes this safe is the same constraint that makes it architecturally sound: the agent has no path to direct execution. Every capital decision from the agent passes through the BDI policy engine - a deterministic guardrail layer that enforces mission parameters regardless of the agent’s reasoning. The agent can produce a sophisticated argument for why a position limit should be temporarily exceeded. The policy engine does not read the argument. It reads the number.
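"It reads the number" can be made literal. The sketch below is not ZeroCopy's BDI policy engine, whose internals are not described here. It is a minimal illustration of a deterministic guardrail whose decision depends only on numeric fields, with hypothetical type and field names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Directive:
    symbol: str
    resulting_position: float  # position after the proposed trade, in units of the asset
    rationale: str             # free text from the agent; never consulted by the check

@dataclass(frozen=True)
class PositionLimits:
    max_abs_position: dict     # symbol -> maximum absolute position

def policy_allows(directive: Directive, limits: PositionLimits) -> bool:
    """Deterministic check: same numbers in, same answer out."""
    limit = limits.max_abs_position.get(directive.symbol)
    if limit is None:
        return False  # instruments without a configured limit fail closed
    return abs(directive.resulting_position) <= limit

limits = PositionLimits(max_abs_position={"BTC": 10.0})

within = Directive("BTC", 8.0, "Routine rebalance.")
persuasive = Directive("BTC", 14.0, "Volatility regime justifies a temporary exception.")

print(policy_allows(within, limits))
print(policy_allows(persuasive, limits))
```

Because `rationale` never enters `policy_allows`, no amount of persuasive text can change the outcome. The only way to exceed the limit is to change the limit, which is a separate, reviewable act.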

The architecture from synthesis to execution:

Sovereign Consciousness (Letta)
→ synthesizes market context, drafts directives
→ BDI Policy Engine (deterministic guardrails)
→ approved directives to Human Review (Command Center)
→ human approves/rejects
→ Sovereign Factory (execution routing)
→ Signing Enclave (Nitro, attested, p50 42µs)
→ Exchange API / on-chain settlement
→ Attestation log (verifiable audit trail)

Every step in this chain is either provable (signing enclave, attestation log) or auditable (BDI policy engine logs, human approval record). The chain from agent synthesis to signed execution is cryptographically traceable.

The AI convergence thesis is this: the operational complexity of multi-venue systematic trading currently requires either a large human team (expensive, error-prone) or a simplified single-venue strategy (lower return potential). AI agents operating within sovereign infrastructure - with guardrails that are cryptographically enforced, not just operationally enforced - make it possible to run the operational complexity of a large team with a small one, while maintaining the auditability that institutional capital requires.

This is not science fiction. It is running in production. The pieces are assembled. What remains is the maturation of the individual components and the integration work to connect them into a seamless whole.


Section 6: Your Move

The firms that will win the next decade of institutional crypto trading are not the ones with the best signals. They are the ones that can operate with:

The lowest counterparty risk. Not “we use a reputable custodian” but “our signing keys are sealed in hardware, our risk limits are enforced cryptographically, and here is the attestation document proving it.”

The highest auditability. Not “we have audit logs” but “every signing operation is provably traceable to a specific enclave running specific code, and an LP or regulator can verify this independently.”

The most efficient capital deployment. Not “we have good risk management” but “our AI agents maintain continuous operational oversight at a cost structure that is independent of portfolio complexity.”

Sovereign infrastructure is the foundation for all three. It is not a feature you add to an existing system - it is an architectural choice you make when you build the system, or a rearchitecture you commit to when the existing system’s limitations become clear.

The technology is not experimental. AWS Nitro Enclaves are production-grade. MPC threshold signing protocols are deployed at institutional scale. LLM-based workflow agents with persistent memory are running in production. The integration work is non-trivial but tractable. The operating model that combines these three layers into a coherent sovereign infrastructure is what ZeroCopy is building.

The question is not whether the industry will adopt this architecture. It will. The Bybit attack alone represents a 10-year setback in institutional confidence in crypto custody, and institutional confidence is required for institutional capital. The attestation model solves this in a way that no amount of operational improvement can. The question is whether you will build on this foundation before or after your competitors.

If you are building trading infrastructure and want to discuss the architecture: the /work/for-funds page describes how ZeroCopy works with systematic funds and prop desks. If you want to understand the technical details before that conversation, subscribe to the newsletter - this series continues with the practical implementation guide for each component.

If you are an LP evaluating a systematic trading firm and want to understand whether their custody architecture is defensible, the attestation model described in Post 2 of this series is the basis for a question you can ask that no one in the industry is currently asking. That question is: “Can you show me a live attestation document for your signing infrastructure?” If they cannot, you are trusting their operations rather than verifying their architecture. You should know that going in.

The future of institutional trading infrastructure is sovereign, attested, and AI-augmented. The firms that understand this today will define what institutional trading looks like in 2030.


Nikhil Padala is the founder of ZeroCopy Systems, which builds Nitro Enclave-backed signing infrastructure and sovereign AI trading agents for institutional trading operations. ZeroCopy is building the infrastructure layer that makes trading operations cryptographically verifiable. For engagements with systematic funds and prop desks, see /work/for-funds. For updates on sovereign trading infrastructure as the space evolves, subscribe to the newsletter.
