Infrastructure
Order Book Reconstruction at Scale: Snapshot + Diff vs Full Stream, and the Race Conditions That Break Both
Snapshot+diff book reconstruction has a race condition that creates phantom liquidity. The correct handling for 12 exchanges, Binance sequence numbers, and the memory layout that performs.
Maintaining correct order books across 12 exchanges simultaneously was one of the core responsibilities of the market data infrastructure I worked on at Akuna Capital. “Correct” here has a precise meaning: at any moment, our book should match what the exchange would show if you queried their REST API, with a lag of at most the time since the last WebSocket message. Crossed books (bid > ask), stale books (last update too old), and gap-afflicted books (missed a diff) all needed to be detected and corrected automatically, without human intervention, in under 100ms.
The naive implementation gets you 99.9% accuracy. The other 0.1% - the race conditions, the edge cases in sequence handling, the exchange-specific quirks - is what costs you money at 3 AM when markets are moving fast.
The Two Book Models
Every exchange offering real-time order book data uses one of two models:
Full snapshot stream: The exchange sends a complete snapshot of the order book every N milliseconds. Each snapshot is self-contained - you don’t need any previous state to interpret it. Binance’s @depth@100ms stream sends the top 20 price levels every 100ms. Simple to implement, but fundamentally 50ms stale on average.
Snapshot + incremental diff: The exchange sends a full snapshot once (via REST or at subscription time), then streams only the changes (diffs) continuously. Each diff contains only the price levels that changed and their new quantities. Latency is much lower (diffs arrive as they happen), but complexity is much higher - you must maintain the local book state and apply each diff correctly.
For any strategy that needs better than 100ms order book accuracy, snapshot+diff is mandatory.
The Race Condition in Snapshot+Diff
This is the most important thing to get right, and it’s documented clearly by Binance but misimplemented constantly.
The race condition: when you subscribe to the diff stream and then request the REST snapshot, diffs are already arriving from the WebSocket. The question is: do you apply the diffs that arrived before the snapshot, after the snapshot, or both?
The correct answer depends entirely on the sequence relationship between the diff and the snapshot.
Binance USDT-M Futures depth stream - the authoritative flow:
Each diff message has two fields:
U: First update ID in this diffu: Last update ID in this diff
The REST snapshot returns lastUpdateId.
The correct algorithm:
from sortedcontainers import SortedDict
from collections import deque
from dataclasses import dataclass, field
from typing import Optional, Dict
import asyncio
import aiohttp
import time
@dataclass
class PriceLevel:
price: float
quantity: float
class OrderBook:
"""
Binance USDT-M Futures order book with correct snapshot+diff handling.
"""
def __init__(self, symbol: str):
self.symbol = symbol
self.bids: SortedDict = SortedDict(lambda x: -x) # Descending order
self.asks: SortedDict = SortedDict() # Ascending order
self.last_update_id: int = 0
self.is_synchronized: bool = False
self._diff_buffer: deque = deque()
self._last_update_time: float = 0.0
def buffer_diff(self, diff: dict) -> None:
"""
Buffer a diff message. Called from WebSocket handler.
Even before synchronization, we buffer - do not discard.
"""
self._diff_buffer.append(diff)
async def synchronize(self, session: aiohttp.ClientSession) -> bool:
"""
Fetch REST snapshot and establish synchronized state.
Returns True on success, False if synchronization failed.
"""
url = f"https://fapi.binance.com/fapi/v1/depth?symbol={self.symbol}&limit=1000"
async with session.get(url) as resp:
snapshot = await resp.json()
snapshot_last_id = snapshot['lastUpdateId']
# Step 1: Discard diffs entirely before the snapshot
while self._diff_buffer:
diff = self._diff_buffer[0]
if diff['u'] <= snapshot_last_id:
self._diff_buffer.popleft()
else:
break
# Step 2: Find first valid diff - must satisfy U <= lastUpdateId+1 <= u
if self._diff_buffer:
first_diff = self._diff_buffer[0]
U = first_diff['U']
u = first_diff['u']
if not (U <= snapshot_last_id + 1 <= u):
# The snapshot and stream don't align - buffer overflowed or we were too slow
# Retry the entire synchronization
self._diff_buffer.clear()
return False
# Step 3: Apply snapshot
self.bids.clear()
self.asks.clear()
for price_str, qty_str in snapshot['bids']:
price, qty = float(price_str), float(qty_str)
if qty > 0:
self.bids[price] = qty
for price_str, qty_str in snapshot['asks']:
price, qty = float(price_str), float(qty_str)
if qty > 0:
self.asks[price] = qty
self.last_update_id = snapshot_last_id
# Step 4: Apply buffered diffs
while self._diff_buffer:
diff = self._diff_buffer.popleft()
self._apply_diff(diff)
self.is_synchronized = True
self._last_update_time = time.perf_counter()
return True
def apply_stream_diff(self, diff: dict) -> None:
"""
Called from WebSocket handler after synchronization.
"""
if not self.is_synchronized:
self.buffer_diff(diff)
return
# Sequence continuity check
U = diff['U']
if U != self.last_update_id + 1:
# Gap detected
self.is_synchronized = False
self._diff_buffer.clear()
self._diff_buffer.append(diff)
# Caller must detect is_synchronized=False and re-synchronize
return
self._apply_diff(diff)
def _apply_diff(self, diff: dict) -> None:
"""Apply a diff to local book state."""
for price_str, qty_str in diff.get('b', []):
price, qty = float(price_str), float(qty_str)
if qty == 0:
self.bids.pop(price, None)
else:
self.bids[price] = qty
for price_str, qty_str in diff.get('a', []):
price, qty = float(price_str), float(qty_str)
if qty == 0:
self.asks.pop(price, None)
else:
self.asks[price] = qty
self.last_update_id = diff['u']
self._last_update_time = time.perf_counter()
def best_bid(self) -> Optional[tuple[float, float]]:
if not self.bids:
return None
price = self.bids.peekitem(0)[0]
return (price, self.bids[price])
def best_ask(self) -> Optional[tuple[float, float]]:
if not self.asks:
return None
price = self.asks.peekitem(0)[0]
return (price, self.asks[price])
def is_crossed(self) -> bool:
bid = self.best_bid()
ask = self.best_ask()
if bid is None or ask is None:
return False
return bid[0] >= ask[0]
def staleness_ms(self) -> float:
if self._last_update_time == 0:
return float('inf')
return (time.perf_counter() - self._last_update_time) * 1000
The Memory Model: Sorted Map vs Flat Array
The SortedDict in the example above (from sortedcontainers library) is a pure-Python sorted dictionary. For a production system, you need to understand the memory and performance tradeoffs.
Sorted map (SortedDict, std::BTreeMap in Rust, TreeMap in Java):
- Insert/delete/lookup: O(log n) where n is the number of price levels
- Best bid/ask: O(1) via peek
- Memory: ~48 bytes per entry in Python (key + value + tree node overhead)
- For a 1000-level book: ~48 KB
- Iteration (for depth rendering): O(k) for top k levels
Flat array with binary search:
- The book is kept sorted by maintaining a fixed-size array of (price, qty) pairs
- Insert: O(n) for shift, but n is small (typical books have 20-200 active levels)
- Lookup: O(log n) via binary search
- Memory: 16 bytes per entry (two doubles), dense layout = better cache performance
- For a 1000-level book: ~16 KB with better cache line utilization
For Python, use sortedcontainers.SortedDict. It’s implemented in pure Python but uses efficient list-based internals that outperform most alternatives in this use case.
For C++ or Rust, a flat array with insertion sort outperforms a tree map for books with fewer than ~500 levels, because cache locality dominates the algorithmic advantage of O(log n) insert.
Crossed Book Detection
A crossed book (best bid >= best ask) is always an error. It can occur from:
- Applying a diff against a wrong base state (stale book)
- A sequence gap that wasn’t detected (missed a cancel or modify)
- An exchange-side glitch (rare but real - Binance has had crossed-book events during extreme volatility)
- Your own logic bug (wrong sign on qty delta, float precision error)
The correct response to a crossed book depends on the cause:
async def book_health_monitor(
book: OrderBook,
resync_fn: callable,
logger,
) -> None:
"""Background task that monitors book health and triggers resync."""
while True:
await asyncio.sleep(0.01) # Check every 10ms
if not book.is_synchronized:
logger.warning("Book desynchronized, resyncing %s", book.symbol)
await resync_fn(book.symbol)
continue
if book.is_crossed():
logger.error(
"Crossed book on %s: bid=%.2f ask=%.2f - resyncing",
book.symbol,
book.best_bid()[0] if book.best_bid() else 0,
book.best_ask()[0] if book.best_ask() else 0,
)
book.is_synchronized = False
await resync_fn(book.symbol)
continue
staleness = book.staleness_ms()
if staleness > 2000: # More than 2s since last update
logger.warning(
"Stale book on %s: last update %dms ago",
book.symbol,
staleness,
)
# Don't immediately resync - the exchange might just be quiet
# But flag for strategy layer to avoid trading on stale data
Do not trade on a crossed or stale book. This sounds obvious but requires explicit enforcement. The strategy should check book.is_synchronized and book.staleness_ms() < threshold before using book data for any decision.
Exchange-Specific Sequence Schemes
Each exchange implements sequence numbers differently. Binance USDT-M Futures is documented above. Other variations:
Binance Spot:
# Spot uses the same U/u fields as futures
# But the diff includes 'e' (event type), 'E' (event time), 's' (symbol)
# Snapshot endpoint: GET /api/v3/depth?symbol=BTCUSDT&limit=5000
# lastUpdateId from snapshot, U/u from diffs - same logic as futures
Binance USDC-M Futures:
# Uses 'pu' (previous update ID) instead of 'U'
# Continuity check: diff.pu == last_update_id
# First valid diff after snapshot: diff.pu == snapshot.lastUpdateId
OKX:
# OKX sends explicit action: "snapshot" or "update"
# On subscribe, first message has action="snapshot" - full book
# Subsequent messages have action="update" - diffs
# Uses seqId/prevSeqId for continuity check
# No need to separately fetch REST snapshot - WebSocket delivers it
Bybit V5 Linear:
# First message after subscribe has "type": "snapshot" - full book
# Subsequent messages have "type": "delta" - diffs
# Uses 'seq' field for continuity check (integer, sequential)
# Snapshot and delta arrive on same WebSocket, eliminating the race condition
Bybit’s design is worth noting: by delivering the snapshot via WebSocket (not REST), they eliminate the race condition entirely. There’s no window between snapshot request and diff arrival. This is strictly better than Binance’s model from an implementation correctness standpoint.
Managing 12 Books Simultaneously
At Akuna, maintaining 12 books simultaneously required careful resource management:
Book update fan-out: Each WebSocket message routes to the correct book by symbol. The hot path is: receive bytes → parse JSON → route to book → apply diff. This path runs on the asyncio event loop and must never block.
Resync coordination: When multiple books need resync simultaneously (e.g., after a network partition), the resync requests must be serialized to avoid hitting REST rate limits. Use a semaphore:
import asyncio
resync_semaphore = asyncio.Semaphore(3) # Max 3 concurrent resyncs
async def resync_book(symbol: str, session: aiohttp.ClientSession) -> None:
async with resync_semaphore:
book = books[symbol]
for attempt in range(3):
success = await book.synchronize(session)
if success:
return
await asyncio.sleep(0.5 * (attempt + 1))
raise RuntimeError(f"Failed to resync {symbol} after 3 attempts")
Memory budget: 12 exchanges × average 50 symbols per exchange × 1000 levels × 16 bytes per level = ~9.6 MB for the raw book data. Python overhead multiplies this by 3-5x. Budget ~50 MB for full book state in Python, ~10 MB in C++ or Rust.
How This Breaks in Production
1. Buffer overflow during slow snapshot fetch Symptom: Synchronization always fails - the “first valid diff” check never passes. Loop of sync attempts. Root cause: Your diff buffer has maxlen=1000, but the REST snapshot takes 800ms to arrive during peak load. In that time, 2000+ diffs arrive and overflow the buffer. When the snapshot arrives, the buffer contains only recent diffs, and the early diffs that bridge snapshot to stream are gone. Fix: Increase buffer size to 50,000+ messages. Snapshot fetch should rarely take more than 200ms, but you need buffer headroom for load spikes.
2. Applying OKX diffs before the initial snapshot
Symptom: OKX book is wrong immediately after subscription. Best bid/ask look plausible but don’t match exchange UI.
Root cause: OKX sends an initial “snapshot” via WebSocket, but also starts sending “update” diffs immediately. If you start applying updates before processing the initial snapshot message, you’re applying diffs against an empty book.
Fix: Buffer all messages until you receive a message with action="snapshot". Apply the snapshot, then apply buffered updates, then enter normal operation.
3. Float precision error creates phantom crossed book
Symptom: Book health monitor logs crossed book every few seconds on high-price assets (like BTC), triggers unnecessary resyncs.
Root cause: String-to-float conversion of price levels introduces floating-point representation error. “43521.10” parsed as float might be 43521.09999999999 or 43521.10000000001. When comparing bid and ask at adjacent price levels, the comparison bid >= ask fires spuriously.
Fix: Use Python’s Decimal or store prices as integers (scaled by 1e8 or the exchange’s minimum tick size). Never use raw float comparison for equality or cross checks.
4. Resync storm on network event Symptom: After a brief network disruption affecting the office/datacenter, all 12×50 = 600 books simultaneously trigger resync. REST rate limits are immediately exhausted. Recovery takes 10+ minutes. Root cause: Network event causes all WebSocket connections to drop simultaneously. All books become desynchronized. Resync logic immediately attempts to refetch 600 REST snapshots. Fix: Implement exponential backoff per symbol with jitter. Rate-limit total concurrent resyncs. Use the exchange’s WebSocket-based snapshot (like Bybit and OKX provide) where available, to avoid REST rate limit pressure.
5. Sequence gap silent handling - book never corrects
Symptom: Book gradually diverges from exchange state over hours. Queries to exchange REST occasionally show different levels than your local book. No errors logged.
Root cause: A sequence gap occurred but apply_stream_diff silently accepted the out-of-sequence message (perhaps by checking U > last_update_id instead of U == last_update_id + 1). The book continues updating from the gap point without re-syncing, accumulating drift.
Fix: On any sequence gap, immediately mark the book unsynchronized and trigger resync. Never silently accept an out-of-sequence diff.
6. Stale book used during volatile market after exchange maintenance
Symptom: Strategy places orders based on a price that’s $500 away from current market. Large loss.
Root cause: Exchange maintenance window closed the WebSocket connection. Book wasn’t updated for 15 minutes but is_synchronized remained True because the stale detection only checks time since last update, and the last update timestamp was only cleared on explicit desync.
Fix: On WebSocket disconnect, immediately set book.is_synchronized = False. Do not wait for staleness threshold. Any disconnect = requires resync.
For the WebSocket infrastructure that delivers these diffs, see WebSocket at HFT Scale. For Binance-specific sequence number details, see Binance Connectivity Deep Dive. For the exchange-specific sequence schemes at OKX, Bybit, and Deribit, see OKX, Bybit, and Deribit API Guide. For multicast-based order book delivery in institutional markets, see Multicast Market Data.
Continue Reading
Enjoyed this?
Get one deep infrastructure insight per week.
Free forever. Unsubscribe anytime.
You're in. Check your inbox.