The Anatomy of a Sub-50µs Trade: Tracing a Packet from NIC to Strategy and Back
A packet-level walkthrough of a sub-50µs trade at Akuna Capital: NIC ring buffer, kernel bypass, strategy evaluation, order encoding, and wire transmit.
Sub-100µs from NIC to strategy decision
The kernel layer beneath every HFT trade: NUMA topology, CPU isolation, kernel bypass networking, huge pages, and the RT scheduling configuration that separates 18µs from 200µs.
Every institutional desk runs Linux bare-metal. The performance delta between a tuned and untuned system is not 10%: it is a full order of magnitude. These ten posts document the specific kernel techniques that close that gap, grounded in production experience at Akuna Capital and Gemini.
A packet-level walkthrough of a sub-50µs trade at Akuna Capital: NIC ring buffer, kernel bypass, strategy evaluation, order encoding, and wire transmit.
P99 doubled from 22µs to 41µs overnight with no code changes. A background analytics job pushed the trading engine off NUMA node 0. Full numastat/perf c2c diagnosis workflow.
How to build a genuinely quiet CPU core for HFT using isolcpus, nohz_full, rcu_nocbs, and proper IRQ migration, with the GRUB cmdline that actually works.
ef_vi vs DPDK vs AF_XDP on identical hardware: ef_vi P99 was 31ns above median; DPDK was 87ns. When the variance gap, not the median, should drive the kernel bypass decision.
How a THP compaction stall caused a 400µs latency spike mid-session, plus the correct way to configure static huge pages for trading systems in production.
How irqbalance moved an RX queue IRQ to the trading core mid-session, what MSI-X actually is, and how to correctly configure per-queue interrupt affinity for HFT.
Debugging a P99 latency spike at Akuna Capital using perf record, Brendan Gregg's flame graphs, eBPF off-CPU analysis, and the critical difference between on-CPU and off-CPU profiles.
Replacing a mutex queue with a lock-free SPSC ring buffer dropped P99 from 50µs to 4µs at Akuna. A correct C++17 implementation with cache-line alignment, plus the false-sharing failure mode it avoids.
SCHED_FIFO for HFT, priority inversion from the Mars Pathfinder to trading latency, priority inheritance mutexes, and the near-miss kernel lockup from a misconfigured RT process.
How a Spectre mitigation patch silently added a 15% latency regression, what resets your tuning without warning, and how to govern a trading server against configuration drift.