Benchmarks

KMS, CloudHSM, and kernel tuning numbers are directly measured. ZeroCopy Sentinel numbers are benchmark-suite results on AWS Nitro; production validation ongoing. Methodology in the section below.

Solution Comparison

The latency your keys add is the alpha you lose. This table shows median, P99, and P99.9 signing latency across common custody solutions.

| Solution | Median | P99 | P99.9 | Source |
|---|---|---|---|---|
| AWS KMS | ~130ms | ~200ms | ~350ms | Measured |
| CloudHSM | ~5ms | ~12ms | ~25ms | Measured |
| Fireblocks | ~300ms | ~600ms | ~1200ms | Published API docs |
| ZeroCopy Sentinel (fastest) | 42µs | 67µs | 120µs | Benchmark suite |

3,095x faster than AWS KMS (median: 42µs vs 130ms)

119x faster than CloudHSM (median: 42µs vs 5ms)

7,143x faster than Fireblocks (median: 42µs vs 300ms)

KMS and CloudHSM numbers were measured on c6i.4xlarge in us-east-1. Fireblocks numbers come from their published API documentation. ZeroCopy numbers come from our Nitro Enclave benchmark suite running on AWS c6i.4xlarge with PREEMPT_RT kernel patches.
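The headline multipliers are simply the ratio of each solution's median to the ZeroCopy Sentinel median. As a rough sketch of that arithmetic, with the values copied from the table (the dictionary and function names are illustrative, not part of any benchmark suite):

```python
# Median signing latencies from the comparison table, in microseconds.
MEDIAN_US = {
    "AWS KMS": 130_000,       # ~130ms
    "CloudHSM": 5_000,        # ~5ms
    "Fireblocks": 300_000,    # ~300ms
    "ZeroCopy Sentinel": 42,  # 42µs
}


def speedup(baseline: str, candidate: str = "ZeroCopy Sentinel") -> int:
    """Ratio of median latencies, rounded to the nearest whole multiple."""
    return round(MEDIAN_US[baseline] / MEDIAN_US[candidate])


for name in ("AWS KMS", "CloudHSM", "Fireblocks"):
    print(f"{name}: {speedup(name):,}x")
# AWS KMS: 3,095x
# CloudHSM: 119x
# Fireblocks: 7,143x
```

Because the KMS, CloudHSM, and Fireblocks medians are approximate, the multipliers carry the same uncertainty as the inputs.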

Kernel Optimization Impact

Default Linux settings are optimized for throughput and power savings, not latency. These are the measured gains from targeted kernel tuning.

| Setting | Detail | Before | After | Delta |
|---|---|---|---|---|
| THP disabled | Transparent Huge Pages | 15ms P99 spikes | 50µs P99 | 300x |
| CPU isolation | isolcpus + nohz_full | ~200µs jitter | ~5µs jitter | 40x |
| HugePages (2MB explicit) | TLB miss elimination | Variable TLB misses | Deterministic | Eliminates |
| Busy polling | SO_BUSY_POLL | ~50µs syscall overhead | ~5µs | 10x |

Measurements were taken on bare-metal c6i.4xlarge with kernel 6.1 + PREEMPT_RT patches. THP spikes were measured using perf stat and cyclictest. CPU isolation was measured with cyclictest -l 100000 -t -n -p99.
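Before reproducing numbers like these, it helps to confirm that the host is actually running with the settings in the table. The sketch below reads the standard Linux sysfs files for THP mode and the isolated and nohz_full CPU lists; the helper names are hypothetical, and the nohz_full file only exists on kernels built with CONFIG_NO_HZ_FULL.

```python
import re
from pathlib import Path
from typing import List, Optional


def active_thp_mode(text: str) -> str:
    """Return the bracketed (active) mode from e.g. 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"no active mode found in {text!r}")
    return match.group(1)


def parse_cpu_list(text: str) -> List[int]:
    """Expand a kernel CPU list such as '2-5,8' into [2, 3, 4, 5, 8]."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:  # an empty file means no CPUs are listed
            continue
        lo, _, hi = part.partition("-")
        cpus.extend(range(int(lo), int(hi or lo) + 1))
    return cpus


def read_or_none(path: str) -> Optional[str]:
    """Read a sysfs file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


if __name__ == "__main__":
    thp = read_or_none("/sys/kernel/mm/transparent_hugepage/enabled")
    isolated = read_or_none("/sys/devices/system/cpu/isolated")
    nohz = read_or_none("/sys/devices/system/cpu/nohz_full")
    print("THP mode:", active_thp_mode(thp) if thp else "unavailable")
    print("isolated CPUs:", parse_cpu_list(isolated) if isolated else "unavailable")
    print("nohz_full CPUs:", parse_cpu_list(nohz) if nohz else "unavailable")
```

A THP mode of never and a non-empty isolated list are consistent with the "After" column; anything else suggests the tuning has not been applied on that host.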

How We Measure

All benchmarks run on AWS c6i.4xlarge instances in us-east-1. Each reported median is taken over 10,000 timed iterations after 1,000 warmup rounds. P99 and P99.9 are calculated from the full distribution using HDR Histogram. Kernel: 6.1 with PREEMPT_RT patches applied. The CPU governor is set to performance, C-states are disabled in BIOS, THP is disabled, and benchmark threads are pinned to isolated cores.
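The warmup-then-measure loop described above can be sketched in a few lines of Python. This is an illustration of the procedure, not the benchmark suite itself: it computes nearest-rank percentiles from the sorted samples as a stand-in for HDR Histogram, and the function names and the `op` callable are assumptions.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(sorted_samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of an ascending list, for pct in (0, 100]."""
    if not sorted_samples:
        raise ValueError("no samples")
    # round() strips floating-point noise (e.g. 9990.000000000002) before ceil.
    rank = math.ceil(round(pct / 100 * len(sorted_samples), 9))
    return sorted_samples[max(rank, 1) - 1]


def benchmark(op: Callable[[], object],
              iterations: int = 10_000,
              warmup: int = 1_000) -> Dict[str, float]:
    """Time op() and report median, P99, and P99.9 in microseconds."""
    for _ in range(warmup):  # warmup rounds are run but not recorded
        op()
    samples: List[float] = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        op()
        samples.append((time.perf_counter_ns() - start) / 1_000)  # ns to µs
    samples.sort()
    return {
        "median": percentile(samples, 50),
        "p99": percentile(samples, 99),
        "p99.9": percentile(samples, 99.9),
    }


if __name__ == "__main__":
    # Placeholder workload; substitute the signing call under test.
    print(benchmark(lambda: sum(range(100))))
```

With 10,000 samples, P99.9 rests on only the ten slowest iterations, which is why the tail figures are more sensitive to noise than the median.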

KMS and CloudHSM latency was measured by timing the full round-trip from signing request submission to signature receipt in the calling process. Network latency to the service endpoint is included because that is what your application actually pays. Fireblocks numbers come from their published API documentation and developer reference. We do not have programmatic access to measure them directly.

ZeroCopy Sentinel latency is measured inside the Nitro Enclave using the benchmark suite on AWS c6i.4xlarge. The 42µs median includes ECDSA signing, attestation document generation, and response serialization. It does not include network transit from enclave to host; add ~5µs for vsock transport for wall-clock application latency. Production validation is ongoing; these are benchmark-suite results, not live-system metrics.
