AWS Nitro Enclaves for Wallet Signing: A Hands-On Architecture Guide

End-to-end guide to AWS Nitro Enclave wallet signing: vsock proxy, KMS PCR attestation flow, cold vs warm latency breakdown, and production failure modes to avoid.


ZeroCopy Systems is built on AWS Nitro Enclaves. This is not a technology decision I made lightly - it represents a fundamental bet that hardware-attested computation is the right primitive for trading infrastructure that handles real capital. This post is the technical architecture document I wish existed when I started building on Nitro.

The core guarantee Nitro Enclaves provide: code running in an enclave is cryptographically attested by AWS hardware. If you modify the enclave code, the PCR (Platform Configuration Register) measurements change, and a KMS key policy that binds to the original PCR values will refuse to decrypt. An attacker who compromises your enclave’s parent instance - including your own AWS root account, within the bounds of the attestation model - cannot extract the signing key or impersonate the enclave.

Our benchmark: P50 42µs, P99 64µs for ECDSA secp256k1 signing on a dedicated core with a warm vsock connection. These are measured benchmark numbers; production validation is ongoing as the system scales.

What a Nitro Enclave Actually Is

An enclave is a stripped-down virtual machine running alongside your normal EC2 instance (the “parent instance”). It has:

  • No network access: no TCP/IP. The only communication channel is the vsock (Virtual Socket), a local channel between the enclave and its parent instance.
  • No persistent storage: the enclave’s filesystem is ephemeral. When it stops, all state is lost. Secrets must be loaded at startup from the parent via vsock.
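  • No SSH access: you cannot SSH into an enclave. You cannot attach a terminal. You cannot see its process list from outside. Outside of debug mode (which zeroes the PCR values in the attestation document, so a PCR-bound KMS policy rejects it), even the instance owner has no mechanism to inspect the running state.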
  • No clock: the enclave has no direct access to wall-clock time. Time must be passed in by the parent if needed.
  • Cryptographically signed boot measurements: the enclave image is measured during build. nitro-cli reports the resulting PCRs (PCR0, PCR1, and PCR2, plus PCR8 for signed images), and the Nitro hypervisor includes them in the signed attestation document it issues to the running enclave on request.

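The attestation document is the core security primitive. It is a CBOR-encoded, COSE-signed document issued by the Nitro hypervisor, with a certificate chain rooted in the AWS Nitro Attestation PKI. It contains:

  • PCR measurements: PCR0-2 describe the enclave image, PCR3 and PCR4 the parent instance's IAM role and instance ID, and PCR8 the image signing certificate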
  • A public key generated inside the enclave (the private key never leaves)
  • Module ID (the enclave’s identifier)
  • Timestamp

KMS exposes fields of the attestation document as policy condition keys. A KMS key policy can say: “only allow decrypt if the requesting enclave has PCR0 = sha384-of-your-enclave-image”. If you change the enclave code, PCR0 changes, the condition fails, and KMS refuses the decrypt. As long as the key policy itself stays intact, no code other than the original enclave image can obtain the plaintext key.
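
To make the PCR condition concrete, here is a minimal sketch of the comparison it implies. This is not KMS's implementation; `PcrSet` and `pcrs_match` are hypothetical names, and the only behavior modeled is "every pinned PCR must be present and equal, ignoring hex case".

```rust
use std::collections::BTreeMap;

/// PCR index -> hex-encoded SHA-384 measurement (hypothetical representation).
type PcrSet = BTreeMap<u8, String>;

/// Returns true only if every PCR pinned by the policy is present in the
/// attestation document and matches, ignoring hex case.
fn pcrs_match(required: &PcrSet, attested: &PcrSet) -> bool {
    required.iter().all(|(index, expected)| {
        attested
            .get(index)
            .map_or(false, |actual| actual.eq_ignore_ascii_case(expected))
    })
}
```

PCR values are public, so a plain string comparison is fine here; nothing secret is being compared.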

The Complete Signing Architecture

Parent Instance (EC2 c5n.2xlarge)
├── Trade Processing Service (receives trade data via HTTPS)
├── vsock server/client (mediates parent ↔ enclave communication)
│   └── /dev/vsock
└── Nitro Enclave
    ├── Policy Verification Module (checks trade against policy rules)
    ├── KMS Client (uses attestation document for key access)
    ├── ECDSA Signing Module (constant-time secp256k1 implementation)
    └── (no network, no storage, no SSH)

AWS KMS
├── Key Policy: allow decrypt only if PCR0 == <known_enclave_hash>
└── Encrypted signing key (ciphertext stored in parent's filesystem)

Signing Flow:
1. Parent receives trade_data from trading system (HTTPS)
2. Parent sends {trade_data, encrypted_key_ciphertext} over vsock to enclave
3. Enclave verifies trade_data against policy rules
4. Enclave calls KMS through the parent's vsock proxy, attaching its attestation document (proving its PCR0 identity)
5. KMS decrypts key only if PCR0 matches policy → returns the key encrypted to the enclave's attestation public key
6. Enclave signs trade_data hash using secp256k1
7. Enclave sends {signature, attestation_doc_hash} back over vsock
8. Parent returns {signature} to trading system
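
Step 3 is the only point where business rules are enforced before the key is touched. The sketch below shows what such a check could look like. The rule fields (`max_amount`, `allowed_destinations`) and the `Trade` shape are assumptions for illustration, not the schema of the real policy_rules.json.

```rust
use std::collections::HashSet;

/// Hypothetical policy: these fields are illustrative only.
struct PolicyRules {
    max_amount: u64,
    allowed_destinations: HashSet<String>,
}

/// Hypothetical decoded trade; the real trade_data format is not shown here.
struct Trade {
    destination: String,
    amount: u64,
}

#[derive(Debug, PartialEq)]
enum PolicyError {
    AmountTooLarge { amount: u64, max: u64 },
    DestinationNotAllowed(String),
}

/// Rejects the trade before any key material is requested from KMS.
fn validate_policy(trade: &Trade, rules: &PolicyRules) -> Result<(), PolicyError> {
    if trade.amount > rules.max_amount {
        return Err(PolicyError::AmountTooLarge {
            amount: trade.amount,
            max: rules.max_amount,
        });
    }
    if !rules.allowed_destinations.contains(&trade.destination) {
        return Err(PolicyError::DestinationNotAllowed(trade.destination.clone()));
    }
    Ok(())
}
```

Failing closed before step 4 means a rejected trade never causes a KMS call at all.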

Step-by-Step Implementation

Building the enclave image:

# enclave/Dockerfile
FROM amazonlinux:2023

RUN yum install -y aws-nitro-enclaves-sdk-c openssl-devel cmake gcc

# Copy your signing application
COPY signing_server /usr/local/bin/signing_server
COPY policy_rules.json /etc/signing/policy_rules.json

# The enclave image must be deterministic for reproducible PCR measurements
# Avoid dynamic timestamps, random UUIDs, etc. in the build

# No EXPOSE instruction: the server listens on vsock port 5000, not a TCP port,
# and Dockerfile instructions do not accept trailing comments.

CMD ["/usr/local/bin/signing_server"]

# Build the Docker image, then the enclave image file (.eif)
docker build -t signing-service:latest enclave/
nitro-cli build-enclave \
    --docker-uri signing-service:latest \
    --output-file signing-enclave.eif

# The output includes the PCR0, PCR1, and PCR2 measurements (plus PCR8 for signed images):
# PCR0: hash of the entire enclave image file
# PCR1: hash of the Linux kernel and boot ramfs
# PCR2: hash of the application
# PCR8: hash of the enclave signing certificate (if signed)

KMS key policy for enclave-only access:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowEnclaveDecrypt",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/signing-service-role"
      },
      "Action": "kms:Decrypt",
      "Resource": "*",
      "Condition": {
        "StringEqualsIgnoreCase": {
          "kms:RecipientAttestation:PCR0": "YOUR_PCR0_HEX_VALUE_HERE",
          "kms:RecipientAttestation:PCR1": "YOUR_PCR1_HEX_VALUE_HERE",
          "kms:RecipientAttestation:PCR2": "YOUR_PCR2_HEX_VALUE_HERE"
        }
      }
    },
    {
      "Sid": "DenyNonEnclaveDecrypt",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "kms:Decrypt",
      "Resource": "*",
      "Condition": {
        "Null": {
          "kms:RecipientAttestation:PCR0": "true"
        }
      }
    }
  ]
}

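The DenyNonEnclaveDecrypt statement is critical: it denies any decrypt request that does not include an enclave attestation document. This means even your root account, even an AWS admin, cannot decrypt the key without the attestation document - which requires the exact enclave code to be running. The guarantee is only as strong as control over the policy itself: a principal allowed to call kms:PutKeyPolicy can rewrite these statements, so restrict and audit that permission.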
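
The interaction between the two statements can be sketched as a small decision function. This is a simplified model, not the real IAM/KMS evaluation logic: it only considers PCR0 and a single principal, and `DecryptAttempt` and `evaluate` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    ExplicitDeny,
    ImplicitDeny,
}

struct DecryptAttempt<'a> {
    principal_arn: &'a str,
    /// PCR0 from the attestation document, or None if the request carried none.
    attested_pcr0: Option<&'a str>,
}

fn evaluate(attempt: &DecryptAttempt, allowed_principal: &str, pinned_pcr0: &str) -> Decision {
    match attempt.attested_pcr0 {
        // DenyNonEnclaveDecrypt: an explicit Deny wins over any Allow.
        None => Decision::ExplicitDeny,
        // AllowEnclaveDecrypt: principal and PCR0 must both match.
        Some(pcr0)
            if attempt.principal_arn == allowed_principal
                && pcr0.eq_ignore_ascii_case(pinned_pcr0) =>
        {
            Decision::Allow
        }
        // No statement matched, so the request falls through to the default deny.
        Some(_) => Decision::ImplicitDeny,
    }
}
```

The point of the explicit Deny is the first arm: a request without attestation is refused no matter which other Allow statements exist elsewhere.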

The vsock server inside the enclave (Rust):

// enclave/src/main.rs
use vsock::{VsockListener, VsockStream, VMADDR_CID_ANY};
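// Assumed Rust KMS wrapper; AWS's official Nitro Enclaves SDK (aws-nitro-enclaves-sdk-c) is a C library.
use aws_nitro_enclaves_sdk::kms::{KmsClient, DecryptRequest};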
use secp256k1::{Secp256k1, SecretKey, Message};
use sha2::{Sha256, Digest};

const VSOCK_PORT: u32 = 5000;
const VMADDR_CID_PARENT: u32 = 3;  // Parent instance is always CID 3

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let listener = VsockListener::bind_with_cid_port(VMADDR_CID_ANY, VSOCK_PORT)?;
    println!("Enclave listening on vsock port {}", VSOCK_PORT);

    for stream in listener.incoming() {
        let stream = stream?;
        tokio::spawn(handle_connection(stream));
    }
    Ok(())
}

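async fn handle_connection(mut stream: VsockStream) {
    // Receive signing request from parent; drop the connection on a malformed
    // frame instead of panicking the task.
    let request: SigningRequest = match read_frame(&mut stream).await {
        Ok(request) => request,
        Err(_) => return,
    };

    // Step 1: Validate against policy.
    // Caution: policy_rules arrives from the untrusted parent here. Rules that must
    // be tamper-proof should come from /etc/signing/policy_rules.json, which is baked
    // into the image and therefore covered by the PCR measurements.
    if let Err(e) = validate_policy(&request.trade_data, &request.policy_rules) {
        send_error(&mut stream, e).await;
        return;
    }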

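    // Step 2: Call KMS with attestation document.
    // The enclave has no network, so KMS traffic must be tunneled over vsock
    // through a proxy on the parent (for example the vsock-proxy shipped with nitro-cli).
    let kms = KmsClient::new_with_attestation().await.unwrap();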
    let decrypt_result = kms.decrypt(DecryptRequest {
        ciphertext_blob: request.encrypted_key_ciphertext,
        ..Default::default()
    }).await;

    let signing_key_bytes = match decrypt_result {
        Ok(r) => r.plaintext.unwrap(),
        Err(e) => {
            // KMS refused - PCR mismatch or policy violation
            send_error(&mut stream, format!("KMS decrypt failed: {}", e)).await;
            return;
        }
    };

    // Step 3: Sign the trade data
    let secp = Secp256k1::new();
    let secret_key = SecretKey::from_slice(&signing_key_bytes).unwrap();

    let trade_hash = Sha256::digest(&request.trade_data);
    let message = Message::from_digest_slice(&trade_hash).unwrap();
    let sig = secp.sign_ecdsa(&message, &secret_key);

    // Step 4: Return signature to parent
    let response = SigningResponse {
        signature: sig.serialize_compact().to_vec(),
        enclave_timestamp: None,  // No clock in enclave
    };

    send_frame(&mut stream, response).await.unwrap();

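    // Key material goes out of scope here, but it is NOT zeroed automatically:
    // secp256k1::SecretKey is Copy and has no zeroing Drop. Erase it explicitly
    // (e.g. SecretKey::non_secure_erase) and wipe signing_key_bytes
    // (e.g. with the zeroize crate) before returning.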
}
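
The server above calls read_frame and send_frame without defining them. One common way to frame messages on a stream socket is a 4-byte big-endian length prefix followed by the payload. The sketch below is a synchronous, bytes-only version under that assumption (the original helpers are async and typed, and their wire format is not shown). It works on any `Read`/`Write` implementor; `MAX_FRAME_LEN` is an arbitrary cap chosen for illustration.

```rust
use std::io::{self, Read, Write};

/// Upper bound on a single frame (assumed value); protects the enclave from
/// allocating arbitrary amounts of memory on a bogus length prefix.
const MAX_FRAME_LEN: usize = 1 << 20;

/// Writes a 4-byte big-endian length followed by the payload.
fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32;
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(payload)?;
    writer.flush()
}

/// Reads one length-prefixed frame, rejecting oversized lengths before allocating.
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len_bytes = [0u8; 4];
    reader.read_exact(&mut len_bytes)?;
    let len = u32::from_be_bytes(len_bytes) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

The length cap matters more inside an enclave than elsewhere: memory is fixed at launch, so an unbounded allocation driven by parent-supplied input is a direct path to the out-of-memory crash described later.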

The vsock client on the parent instance:

// parent/src/signing_client.rs
use vsock::VsockStream;
use std::time::Instant;

const ENCLAVE_CID: u32 = 16;   // Assigned at enclave start
const VSOCK_PORT: u32 = 5000;

pub struct SigningClient {
    // Keep connection open for latency - reconnect only on error
    stream: Option<VsockStream>,
}

impl SigningClient {
    pub async fn sign_trade(
        &mut self,
        trade_data: &[u8],
        encrypted_key: &[u8],
    ) -> Result<Vec<u8>, SigningError> {
        let t0 = Instant::now();

        let stream = self.get_or_connect().await?;

        let request = SigningRequest {
            trade_data: trade_data.to_vec(),
            encrypted_key_ciphertext: encrypted_key.to_vec(),
            policy_rules: load_current_policy_rules(),
        };

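        // Drop the cached stream on any I/O error so the next call reconnects
        // (for example after an enclave restart).
        if let Err(e) = write_frame(stream, &request).await {
            self.stream = None;
            return Err(e.into());
        }
        let response: SigningResponse = match read_frame(stream).await {
            Ok(response) => response,
            Err(e) => {
                self.stream = None;
                return Err(e.into());
            }
        };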

        let elapsed = t0.elapsed();
        metrics::histogram!("signing_latency_us", elapsed.as_micros() as f64);

        Ok(response.signature)
    }

    async fn get_or_connect(&mut self) -> Result<&mut VsockStream, SigningError> {
        if self.stream.is_none() {
            let stream = VsockStream::connect_with_cid_port(ENCLAVE_CID, VSOCK_PORT)
                .map_err(|e| SigningError::ConnectionFailed(e.to_string()))?;

            // Set timeouts to bound latency in failure cases
            stream.set_read_timeout(Some(std::time::Duration::from_millis(500)))?;
            stream.set_write_timeout(Some(std::time::Duration::from_millis(500)))?;

            self.stream = Some(stream);
        }
        Ok(self.stream.as_mut().unwrap())
    }
}
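
The "reconnect only on error" idea in the client can be isolated into a small generic wrapper. This is a sketch under assumptions, not the production client: `Reconnecting` is a hypothetical type, it is synchronous, and it takes the connect function as a closure so it can be exercised without a real vsock device.

```rust
use std::io;

/// Caches one connection and discards it when an operation fails, so the
/// next call reconnects (for example after an enclave restart).
struct Reconnecting<C, F>
where
    F: FnMut() -> io::Result<C>,
{
    connect: F,
    conn: Option<C>,
}

impl<C, F> Reconnecting<C, F>
where
    F: FnMut() -> io::Result<C>,
{
    fn new(connect: F) -> Self {
        Reconnecting { connect, conn: None }
    }

    /// Runs `op` on the cached connection, connecting first if needed.
    /// On failure the connection is dropped; the error is returned unchanged.
    fn with_conn<T, Op>(&mut self, op: Op) -> io::Result<T>
    where
        Op: FnOnce(&mut C) -> io::Result<T>,
    {
        if self.conn.is_none() {
            self.conn = Some((self.connect)()?);
        }
        let result = op(self.conn.as_mut().expect("connection was just set"));
        if result.is_err() {
            self.conn = None;
        }
        result
    }
}
```

The wrapper deliberately does not retry the failed operation itself. Whether a signing request is safe to resend after a partial write is a decision for the caller.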

Measured Latency Breakdown

On an isolated CPU core (see Determinism Under Load) on a c5n.2xlarge with the enclave running on the same physical instance:

Operation                             P50      P99      P99.9    Notes
──────────────────────────────────────────────────────────────────────────────
Policy validation (in enclave)        1µs      2µs      4µs      JSON rule eval
vsock round-trip (empty)              8µs      12µs     18µs     Local VM comm
KMS decrypt (first call, cold)        280µs    450µs    620µs    Network to KMS
KMS decrypt (subsequent, cached key)  31µs     47µs     63µs     Using session key
secp256k1 sign (32-byte digest)       3µs      5µs      8µs      Constant-time impl
Total (cold KMS)                      293µs    476µs    713µs
Total (warm KMS session)              43µs     66µs     93µs     Benchmark steady-state
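
For readers reproducing numbers like these, here is one way to compute percentiles from raw latency samples. It is a sketch using the nearest-rank method, which is an assumption: the source does not say how its percentiles were derived. The percentile is passed in per-mille (999 means P99.9) to avoid floating-point rounding at the rank boundary.

```rust
/// Nearest-rank percentile over latency samples (e.g. microseconds).
/// `per_mille` is the percentile in thousandths: 500 = P50, 990 = P99, 999 = P99.9.
/// Returns None for an empty sample set or an out-of-range percentile.
fn percentile(samples: &mut [u64], per_mille: usize) -> Option<u64> {
    if samples.is_empty() || per_mille == 0 || per_mille > 1000 {
        return None;
    }
    samples.sort_unstable();
    // Smallest rank such that at least per_mille/1000 of samples are at or below it.
    let rank = (samples.len() * per_mille + 999) / 1000;
    Some(samples[rank - 1])
}
```

Note that per-stage percentiles do not generally sum to the end-to-end percentile, since the slowest policy check rarely coincides with the slowest KMS call. Total rows are best measured end to end rather than added up.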

The KMS cold vs warm latency difference (280µs vs 31µs) reflects AWS’s session key caching. After the first decrypt, KMS uses a data key cached in the enclave’s memory for subsequent operations, avoiding the full KMS API round-trip. This cache has a TTL (~5 minutes); after TTL expiry, the next decrypt pays the full latency.

The benchmark target of P99.9 < 100µs holds during warm operation. Our signing service architecture re-warms the KMS session proactively every 4 minutes (1 minute before TTL expiry) via a background keep-alive, keeping the data key fresh.
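
The keep-alive schedule boils down to a small state check: refresh once the cached key is within a margin of its TTL. A minimal sketch, assuming the ~5 minute TTL and 1 minute margin quoted above; `CachedKey` and `KeyState` are hypothetical names, and the actual refresh call is left out.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum KeyState {
    Fresh,
    RefreshDue,
    Expired,
}

/// Tracks when the cached key was loaded; holds no key material itself.
struct CachedKey {
    loaded_at: Instant,
    ttl: Duration,
    refresh_margin: Duration,
}

impl CachedKey {
    /// Classifies the cache at time `now`. Taking `now` as a parameter keeps
    /// the logic testable without sleeping.
    fn state_at(&self, now: Instant) -> KeyState {
        let age = now.saturating_duration_since(self.loaded_at);
        if age >= self.ttl {
            KeyState::Expired
        } else if age + self.refresh_margin >= self.ttl {
            KeyState::RefreshDue
        } else {
            KeyState::Fresh
        }
    }
}
```

A background task would poll `state_at(Instant::now())` and trigger a warm-up on `RefreshDue`; the signing path only pays the cold cost if it ever observes `Expired`.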

Deployment: Enclave Memory and CPU Allocation

# Start enclave (run on parent instance)
#   --memory:      MB - must be contiguous, cannot be dynamically expanded
#   --cpu-count:   dedicated CPUs removed from parent's pool
#   --enclave-cid: CID for vsock communication
# (Comments cannot follow a trailing backslash, so they live up here.)
nitro-cli run-enclave \
    --eif-path signing-enclave.eif \
    --memory 512 \
    --cpu-count 2 \
    --enclave-cid 16

The enclave’s memory is reserved from the parent instance at startup and cannot be swapped. On a c5n.2xlarge (8 vCPUs, 21GB RAM), this configuration dedicates 2 vCPUs and 512MB to the enclave, leaving 6 vCPUs and roughly 20.5GB for the parent.

# Verify enclave is running and get measurements
nitro-cli describe-enclaves

# Output includes:
# "PCR0": "...", "PCR1": "...", "PCR2": "..."
# Verify these match the PCR values in your KMS key policy

How This Breaks in Production

Failure 1: PCR mismatch after code update without updating KMS policy. You deploy a new version of the signing service with a security fix. The new code has a different PCR0. KMS refuses to decrypt - your signing service is down. Every signing attempt fails. Fix: implement a blue-green enclave deployment. Before deploying the new enclave image, add the new PCR0 to the KMS key policy. Start the new enclave, verify it can successfully decrypt (smoke test), then remove the old PCR0. Downtime window: zero.
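
The ordering in that fix is what matters: add, verify, then remove. The sketch below models it against an in-memory allow-list. It is a stand-in for the real key policy update, which would go through the KMS API; `PcrAllowList` and `rollout` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// In-memory stand-in for the set of PCR0 values the key policy accepts.
struct PcrAllowList {
    allowed: BTreeSet<String>,
}

impl PcrAllowList {
    fn new(initial: &[&str]) -> Self {
        PcrAllowList {
            allowed: initial.iter().map(|pcr| pcr.to_ascii_lowercase()).collect(),
        }
    }

    fn permits(&self, pcr0: &str) -> bool {
        self.allowed.contains(&pcr0.to_ascii_lowercase())
    }

    /// Blue-green rollout: allow the new PCR0, run the smoke test while both
    /// are accepted, and retire the old PCR0 only if the test passes.
    fn rollout<F>(&mut self, old: &str, new: &str, smoke_test: F) -> Result<(), String>
    where
        F: FnOnce(&PcrAllowList) -> bool,
    {
        let (old, new) = (old.to_ascii_lowercase(), new.to_ascii_lowercase());
        if old == new {
            return Ok(());
        }
        let newly_added = self.allowed.insert(new.clone());
        if !smoke_test(&*self) {
            // Roll back: the old enclave keeps working, the new one is not trusted.
            if newly_added {
                self.allowed.remove(&new);
            }
            return Err(format!("smoke test failed for PCR0 {}", new));
        }
        self.allowed.remove(&old);
        Ok(())
    }
}
```

At no point is the allow-list empty, which is where the zero-downtime property comes from.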

Failure 2: Enclave out-of-memory crash. The signing service accumulates connection state for each concurrent trade. At 1,000 concurrent signings, the enclave exceeds its 512MB memory allocation and crashes. The parent instance observes the enclave is gone; the vsock connection returns an error. Fix: monitor enclave memory usage via the parent’s nitro-cli describe-enclaves output. Add per-connection memory limits in the enclave code. Right-size the enclave memory allocation based on measured peak concurrent usage plus 50% headroom.
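
The sizing rule (measured peak plus 50% headroom) is simple arithmetic, sketched here. The `granularity_mib` parameter is an assumption: it stands for whatever allocation unit your memory reservation rounds to, and the function name is hypothetical.

```rust
/// Recommended enclave memory in MiB: measured peak plus 50% headroom,
/// rounded up to a multiple of `granularity_mib` (treated as 1 if zero).
fn recommended_enclave_mib(peak_mib: u64, granularity_mib: u64) -> u64 {
    // ceil(1.5 * peak) without floating point.
    let with_headroom = peak_mib + (peak_mib + 1) / 2;
    let granularity = granularity_mib.max(1);
    ((with_headroom + granularity - 1) / granularity) * granularity
}
```

For example, a measured peak of 340 MiB with a 64 MiB granularity lands on the 512 MiB used in this guide, while a peak that already touches 512 MiB calls for 768 MiB.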

Failure 3: vsock connection not reused causing latency spikes. The parent creates a new vsock connection for every signing request. Each connection setup takes ~500µs. At 5,000 signs/second, connection setup overhead alone is 2.5 seconds per second of wall time - impossible. Fix: use persistent vsock connections (as shown in the code above). The vsock connection is a lightweight local channel; keeping it open is always correct. Implement reconnection logic for the case where the enclave restarts.
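
The arithmetic behind that claim, as a tiny sketch. The 500µs setup cost and the 8µs warm round-trip come from the figures in this post; the function name is hypothetical.

```rust
use std::time::Duration;

/// Total time spent on a fixed per-request cost during one second of traffic.
fn overhead_per_second(requests_per_second: u32, cost_per_request: Duration) -> Duration {
    cost_per_request * requests_per_second
}

/// True if the overhead alone needs more than one second of serial time
/// per second of wall time, i.e. a single connection path cannot keep up.
fn exceeds_wall_time(overhead: Duration) -> bool {
    overhead > Duration::from_secs(1)
}
```

At 5,000 signs/second, per-request connection setup costs 2.5s per second, while the 8µs round-trip on a persistent connection costs 40ms per second.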

Failure 4: KMS session key expiring under bursty load. During a market event, signing requests spike to 10x normal rate. The background keep-alive thread is competing with signing threads for KMS access. The keep-alive misses its 4-minute window; the session key expires. The next signing request pays 280µs for KMS cold start. Under bursty load, multiple signing threads simultaneously try to warm the KMS session - causing a thundering herd. Fix: use a dedicated background task or thread for KMS keep-alive that has higher priority than signing threads. Implement a single-flight lock so only one KMS warm-up is in flight at a time.
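
A single-flight lock can be as small as one atomic flag with an RAII guard. This is a minimal sketch, not the production implementation: `SingleFlight` and `FlightGuard` are hypothetical names, and callers that lose the race simply skip the warm-up rather than waiting for its result.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Allows at most one warm-up to be in flight at a time.
struct SingleFlight {
    in_flight: AtomicBool,
}

/// Clears the flag when dropped, even if the warm-up returns early or panics.
struct FlightGuard<'a> {
    flag: &'a AtomicBool,
}

impl SingleFlight {
    fn new() -> Self {
        SingleFlight { in_flight: AtomicBool::new(false) }
    }

    /// Returns a guard if no warm-up is in flight. None means another caller
    /// is already refreshing, so this one should not start a second request.
    fn try_begin<'a>(&'a self) -> Option<FlightGuard<'a>> {
        self.in_flight
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .ok()
            .map(|_| FlightGuard { flag: &self.in_flight })
    }
}

impl<'a> Drop for FlightGuard<'a> {
    fn drop(&mut self) {
        self.flag.store(false, Ordering::Release);
    }
}
```

A signing thread that gets `None` can either use the still-valid cached key or wait briefly and re-check; what it must not do is issue its own KMS call.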

Failure 5: Attestation document replay attack. An attacker obtains an attestation document from a previous, legitimate enclave run and presents it to KMS to decrypt the key outside the enclave context. Fix: KMS’s attestation validation is built to prevent this. The attestation document includes the public key of an ephemeral key pair generated inside the enclave. KMS encrypts the response under this public key; only the enclave holds the corresponding private key, so only the enclave can read the response. The attestation document cannot be replayed.

Failure 6: Side-channel via timing on parent instance. The parent instance runs on a physical host shared with other tenants (unless you’re on Dedicated Hosts). A co-tenant who can monitor cache timing or power consumption could potentially infer information about signing operations. Fix: use Dedicated Hosts (Tenancy=host in the instance’s placement settings) for parent instances that run signing enclaves. The cryptographic isolation of the enclave protects the key material; dedicated tenancy protects against cache-timing side channels on the non-enclave code path.


Related reading: Side-Channel Resistance and Constant-Time Crypto covers the ECDSA implementation requirements for the signing code inside the enclave. TEEs for Trading: Nitro, SGX, and SEV Compared covers the broader landscape. Determinism Under Load explains the full latency engineering behind the P99.9 < 100µs benchmark target.
