Benchmarks
KMS, CloudHSM, and kernel tuning numbers are directly measured. ZeroCopy Sentinel numbers are benchmark-suite results on AWS Nitro; production validation ongoing. Methodology in the section below.
Solution Comparison
The latency your keys add is the alpha you lose. This table shows median, P99, and P99.9 signing latency across common custody solutions.
| Solution | Median | P99 | P99.9 | Source |
|---|---|---|---|---|
| AWS KMS | ~130ms | ~200ms | ~350ms | Measured |
| CloudHSM | ~5ms | ~12ms | ~25ms | Measured |
| Fireblocks | ~300ms | ~600ms | ~1200ms | Published API docs |
| ZeroCopy Sentinel (fastest) | 42µs | 67µs | 120µs | Benchmark suite |
3,095x faster than AWS KMS (median: 42µs vs 130ms)
119x faster than CloudHSM (median: 42µs vs 5ms)
7,143x faster than Fireblocks (median: 42µs vs 300ms)
KMS and CloudHSM numbers measured on c6i.4xlarge in us-east-1. Fireblocks numbers from their published API documentation. ZeroCopy numbers from our Nitro Enclave benchmark suite running on AWS c6i.4xlarge with PREEMPT_RT kernel patches.
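The speedup figures follow directly from the median column. As a minimal sketch of that arithmetic (the names `MEDIAN_US`, `SENTINEL_MEDIAN_US`, and `speedup` are illustrative, not part of any benchmark tooling), each ratio is the baseline median divided by the 42µs ZeroCopy Sentinel median, rounded to the nearest integer:

```python
# Median signing latencies from the comparison table, in microseconds.
MEDIAN_US = {
    "AWS KMS": 130_000,     # ~130ms
    "CloudHSM": 5_000,      # ~5ms
    "Fireblocks": 300_000,  # ~300ms
}
SENTINEL_MEDIAN_US = 42     # ZeroCopy Sentinel benchmark-suite median


def speedup(baseline_us: float, candidate_us: float) -> int:
    """Return how many times faster the candidate is, rounded to the nearest integer."""
    return round(baseline_us / candidate_us)


ratios = {name: speedup(us, SENTINEL_MEDIAN_US) for name, us in MEDIAN_US.items()}

for name, ratio in ratios.items():
    print(f"{ratio:,}x faster than {name}")
```

Because the KMS, CloudHSM, and Fireblocks medians are approximate, the ratios carry the same uncertainty.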
Kernel Optimization Impact
Default Linux settings are optimized for throughput and power savings, not latency. These are the measured gains from targeted kernel tuning.
| Setting | Before | After | Delta |
|---|---|---|---|
| THP disabled (Transparent Huge Pages) | 15ms P99 spikes | 50µs P99 | 300x |
| CPU isolation (isolcpus + nohz_full) | ~200µs jitter | ~5µs jitter | 40x |
| HugePages, 2MB explicit (TLB miss elimination) | Variable TLB misses | Deterministic | Eliminates |
| Busy polling (SO_BUSY_POLL) | ~50µs syscall overhead | ~5µs | 10x |
Measurements taken on bare-metal c6i.4xlarge with kernel 6.1 + PREEMPT_RT patches.
THP spikes measured using perf stat and cyclictest.
CPU isolation measured with cyclictest -l 100000 -t -n -p99.
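Before comparing numbers against the table, it helps to confirm that a host is actually running with these settings. The following is a read-only sketch, assuming a Linux host that exposes the standard `/sys/kernel/mm/transparent_hugepage/enabled` and `/proc/cmdline` files. The helper names (`active_thp_mode`, `cmdline_params`, `read_text`) are hypothetical, and the script changes nothing on the system:

```python
from pathlib import Path
from typing import Dict, Optional


def active_thp_mode(sysfs_text: str) -> str:
    """Extract the active mode from text such as 'always madvise [never]'."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return "unknown"  # no bracketed token found


def cmdline_params(cmdline: str) -> Dict[str, str]:
    """Parse a kernel command line into {name: value}; bare flags map to ''."""
    params: Dict[str, str] = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def read_text(path: str) -> Optional[str]:
    """Return the file contents, or None if the file is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


if __name__ == "__main__":
    thp = read_text("/sys/kernel/mm/transparent_hugepage/enabled")
    if thp is not None:
        print("THP mode:", active_thp_mode(thp))

    cmdline = read_text("/proc/cmdline")
    if cmdline is not None:
        params = cmdline_params(cmdline)
        for name in ("isolcpus", "nohz_full"):
            print(f"{name}:", params.get(name, "(not set)"))
```

A THP mode of `never` and non-empty `isolcpus` and `nohz_full` values would be consistent with the tuned configuration described here. The script does not check HugePages reservation or busy polling.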
How We Measure
All benchmarks run on AWS c6i.4xlarge instances in us-east-1. Each reported median is taken over 10,000 iterations after 1,000 warmup rounds. P99 and P99.9 are calculated from the same full distribution using HDR Histogram. Kernel: 6.1 with PREEMPT_RT patches applied. CPU governor set to performance, C-states disabled in BIOS, THP disabled, and benchmark threads pinned to isolated cores.
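The measurement loop described above can be sketched roughly as follows. This is an illustration, not the benchmark suite itself: `measure` and `percentile` are hypothetical names, `op` stands in for whatever signing call is being timed, and a nearest-rank percentile over sorted samples replaces HDR Histogram to keep the example dependency-free.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(sorted_ns: List[int], pct: float) -> int:
    """Nearest-rank percentile of an ascending list of samples."""
    if not sorted_ns:
        raise ValueError("no samples")
    # Round before ceil so float noise (e.g. 999.0000000001) cannot bump the rank.
    rank = math.ceil(round(pct * len(sorted_ns) / 100, 6))
    return sorted_ns[max(rank, 1) - 1]


def measure(op: Callable[[], object],
            iterations: int = 10_000,
            warmup: int = 1_000) -> Dict[str, int]:
    """Time op() in nanoseconds and summarize the full distribution."""
    for _ in range(warmup):  # warmup rounds are executed but not recorded
        op()

    samples: List[int] = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        op()
        samples.append(time.perf_counter_ns() - start)

    samples.sort()
    return {
        "median": percentile(samples, 50),
        "p99": percentile(samples, 99),
        "p99.9": percentile(samples, 99.9),
    }


if __name__ == "__main__":
    # Placeholder workload; substitute the real signing call here.
    print(measure(lambda: sum(range(100))))
```

Keeping every sample and computing the tail percentiles from the same run avoids mixing distributions. A pure-Python loop adds interpreter overhead of its own, so this harness suits millisecond-scale calls better than microsecond-scale ones.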
KMS and CloudHSM latency was measured by timing the full round-trip from signing request submission to signature receipt in the calling process. Network latency to the service endpoint is included because that is what your application actually pays. Fireblocks numbers come from their published API documentation and developer reference. We do not have programmatic access to measure them directly.
ZeroCopy Sentinel latency is measured inside the Nitro Enclave using the benchmark suite on AWS c6i.4xlarge. The 42µs median includes ECDSA signing, attestation document generation, and response serialization. It does not include network transit from enclave to host; add ~5µs for vsock transport for wall-clock application latency. Production validation is ongoing; these are benchmark-suite results, not live-system metrics.
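To translate the in-enclave figure into what a host process would observe, the vsock estimate is added on top. This is a small sketch of that adjustment; `wall_clock_us` and `VSOCK_OVERHEAD_US` are illustrative names, and the ~5µs value is the approximate figure quoted above rather than a separately measured distribution:

```python
VSOCK_OVERHEAD_US = 5  # approximate enclave-to-host transit, as quoted above


def wall_clock_us(in_enclave_us: float, vsock_us: float = VSOCK_OVERHEAD_US) -> float:
    """Estimate host-observed latency as in-enclave latency plus vsock transit."""
    return in_enclave_us + vsock_us


median_wall_clock = wall_clock_us(42)  # 42µs in-enclave median
print(f"estimated wall-clock median: ~{median_wall_clock}µs")
```

The sketch adjusts only the median. Applying the same fixed offset to P99 or P99.9 would assume that vsock transit has no tail of its own, which the figures here do not establish.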