Building a Smart Order Router for Fragmented Crypto Liquidity
SOR for fragmented crypto liquidity: fee-adjusted mid routing, maker/taker status, rate limit headroom, and the position drift and hedge imbalance failures that break naive implementations.
At Akuna Capital, we maintained connections to 12 exchanges simultaneously. Not because we needed 12 venues for every trade, but because at any given moment, the best execution for a given order might be on any of them. The smart order router (SOR) was the component that decided: given this symbol, this side, this size, and this urgency - which venue, what order type, and what price?
Getting SOR design right is one of the highest-leverage engineering decisions in a crypto trading system. A poor SOR costs you on every single order - worse fills, higher fees, or missed opportunities. A good SOR compounds over time. At 5,000/day.
The SOR Decision Tree
The SOR is not a black box. Every routing decision should be explainable and deterministic for a given market snapshot. Here is the decision tree I use:
1. Is this symbol available on multiple venues?
   No → Use the only available venue
2. For each available venue, compute: fee-adjusted effective price
   (bid or ask adjusted for taker fee, or offer adjusted for rebate)
3. Filter venues with insufficient liquidity at target price
   (available quantity at limit price < order quantity)
4. Filter venues currently at or near rate limit
   (weight used > 80% of limit, or order count > 85% of limit)
5. Among remaining venues: rank by fee-adjusted price
   Best price wins; tie-break on venue reliability score
6. For large orders (> venue_depth_threshold): split across venues
   Allocate proportional to available liquidity at each level
7. Check if taker vs maker matters for this order
   If urgency=HIGH: taker acceptable
   If urgency=LOW: restrict to venues where we can post maker
8. Execute
Fee-Adjusted Effective Price
The best nominal price across venues is not the execution quality metric. The fee-adjusted effective price is.
For a taker buy order:
Effective price = ask_price × (1 + taker_fee_rate)
For a maker buy order:
Effective price = bid_price × (1 - maker_rebate_rate)
Fee structure by venue (approximate 2024 values, verify current rates):
Venue    Maker fee  Taker fee  Notes
-------  ---------  ---------  -----
Binance  -0.02%     0.05%      Rebate for makers (negative = rebate)
OKX      -0.02%     0.05%      Similar to Binance
Bybit    -0.01%     0.06%      Lower rebate, higher taker
Deribit  -0.01%     0.03%      Options: different model (see below)
Kraken   0.02%      0.05%      No maker rebate
Concrete example: BTC at $43,500. Buy order for 0.1 BTC.
Venue A ask: $43,500 (taker fee: 0.05%)
Effective price = $43,500 × 1.0005 = $43,521.75
Venue B ask: $43,502 (taker fee: 0.02%)
Effective price = $43,502 × 1.0002 = $43,510.70
→ Route to Venue B despite worse nominal price
Net saving: $11.05 per BTC (about 2.5 bps), or roughly $1.10 on this 0.1 BTC fill
At $10M/day notional: ~$2,500/day from fee-adjusted routing alone
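The arithmetic in this example can be checked with a few lines of Python. This is only a sanity check on the numbers above; the helper name `taker_buy_effective_price` is made up for illustration.

```python
def taker_buy_effective_price(ask: float, taker_fee_rate: float) -> float:
    """Effective cost per unit for a taker buy, fee included."""
    return ask * (1 + taker_fee_rate)


venue_a = taker_buy_effective_price(43_500.0, 0.0005)  # 0.05% taker fee
venue_b = taker_buy_effective_price(43_502.0, 0.0002)  # 0.02% taker fee

saving_per_btc = venue_a - venue_b               # about $11.05 per BTC
saving_bps = saving_per_btc / 43_500.0 * 1e4     # about 2.5 bps of notional
saving_on_fill = saving_per_btc * 0.1            # about $1.10 on a 0.1 BTC order
daily_saving = 10_000_000 * saving_bps / 1e4     # about $2,500 on $10M/day

print(f"A={venue_a:.2f} B={venue_b:.2f} saving/BTC={saving_per_btc:.2f}")
print(f"bps={saving_bps:.2f} per-fill={saving_on_fill:.2f} daily={daily_saving:.0f}")
```

The saving scales with notional, not with the number of fills, which is why expressing it in basis points is the more useful form.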
In Python pseudocode:
from dataclasses import dataclass
from typing import Optional


@dataclass
class VenueFeeSchedule:
    maker_rebate: float  # Negative = you receive rebate
    taker_fee: float


@dataclass
class VenueQuote:
    venue: str
    bid: float
    bid_qty: float
    ask: float
    ask_qty: float
    fee: VenueFeeSchedule
    rate_limit_headroom: float  # 0.0-1.0, fraction of limit remaining


def fee_adjusted_price(
    nominal_price: float,
    is_buy: bool,
    is_taker: bool,
    fee: VenueFeeSchedule,
) -> float:
    """Lower is better for buys. Higher is better for sells."""
    if is_taker:
        fee_rate = fee.taker_fee
    else:
        fee_rate = fee.maker_rebate  # Negative means cost is negative = rebate
    if is_buy:
        return nominal_price * (1 + fee_rate)
    else:
        return nominal_price * (1 - fee_rate)


def select_venue_for_taker_buy(
    quotes: list[VenueQuote],
    quantity: float,
    min_rate_limit_headroom: float = 0.15,
) -> Optional[str]:
    """
    Select best venue for a taker buy order.
    Returns venue name, or None if no suitable venue available.
    """
    candidates = []
    for q in quotes:
        # Filter: insufficient liquidity
        if q.ask_qty < quantity:
            continue
        # Filter: rate limit too low
        if q.rate_limit_headroom < min_rate_limit_headroom:
            continue
        eff_price = fee_adjusted_price(
            nominal_price=q.ask,
            is_buy=True,
            is_taker=True,
            fee=q.fee,
        )
        candidates.append((eff_price, q.venue, q))
    if not candidates:
        return None
    # Sort by effective price (ascending for buys)
    candidates.sort(key=lambda x: x[0])
    return candidates[0][1]
Liquidity Slippage: When to Split Across Venues
For orders larger than the top-of-book liquidity, you need to decide: fill the remainder on the same venue at worse prices, or split across venues.
The slippage curve tells you the effective average price as a function of fill size:
def compute_slippage_curve(
    order_book: dict,  # {"bids": [[price, qty], ...], "asks": [[price, qty], ...]}
    side: str,
    max_qty: float,
) -> list[tuple[float, float]]:
    """
    Returns list of (cumulative_qty, avg_price) pairs.
    Shows effective average fill price as order size grows.
    """
    levels = order_book['asks'] if side == 'buy' else order_book['bids']
    curve = []
    cumulative_qty = 0.0
    cumulative_notional = 0.0
    for price, qty in levels:
        price, qty = float(price), float(qty)
        fill_qty = min(qty, max_qty - cumulative_qty)
        if fill_qty <= 0:
            continue  # Skip empty levels; also avoids dividing by zero below
        cumulative_qty += fill_qty
        cumulative_notional += fill_qty * price
        avg_price = cumulative_notional / cumulative_qty
        curve.append((cumulative_qty, avg_price))
        if cumulative_qty >= max_qty:
            break
    return curve


def should_split_order(
    quotes: list[VenueQuote],
    order_qty: float,
    side: str,
    max_acceptable_slippage_bps: float = 2.0,  # Not used by this simplified version
) -> bool:
    """
    Determines if splitting across venues reduces slippage enough to justify
    the added complexity.
    """
    if not quotes:
        return False  # Nothing to route to, so nothing to split
    # Assumes quotes are pre-sorted with the best fee-adjusted venue first
    best_venue = quotes[0]
    # Simplified: check if we can fill the full order at top-of-book
    available_at_top = best_venue.ask_qty if side == 'buy' else best_venue.bid_qty
    if available_at_top >= order_qty:
        return False  # No split needed, full fill at top-of-book price
    # Compute expected slippage on single venue
    # (requires order book data, not just top of book)
    # This is where you need the full depth from order book reconstruction
    return True  # Simplified: split whenever we exceed top-of-book quantity
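The simplified check above stops short of measuring single-venue slippage. A minimal sketch of that missing step, assuming you have the venue's depth as a best-first list of (price, qty) levels, might look like the following. The names `avg_fill_price` and `slippage_bps` are illustrative, not part of the original implementation.

```python
from typing import Optional


def avg_fill_price(levels: list[tuple[float, float]], qty: float) -> Optional[float]:
    """Average price to fill `qty` walking `levels` best-first; None if depth runs out."""
    remaining = qty
    notional = 0.0
    for price, size in levels:
        take = min(size, remaining)
        notional += take * price
        remaining -= take
        if remaining <= 1e-12:
            return notional / qty
    return None


def slippage_bps(levels: list[tuple[float, float]], qty: float) -> Optional[float]:
    """Slippage of the average fill versus top-of-book, in basis points."""
    if qty <= 0 or not levels:
        return None
    avg = avg_fill_price(levels, qty)
    if avg is None:
        return None  # Not enough displayed depth on this venue
    top = levels[0][0]
    return abs(avg - top) / top * 1e4


# Hypothetical ask side of one venue
asks = [(43_500.0, 0.5), (43_505.0, 1.0), (43_520.0, 2.0)]
small = slippage_bps(asks, 1.0)   # walks two levels
large = slippage_bps(asks, 3.0)   # walks three levels
print(small, large)
```

With a 2 bps threshold, the 1.0 BTC order stays on a single venue and the 3.0 BTC order becomes a candidate for splitting. A `None` result means the venue cannot absorb the order at all from displayed depth.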
The splitting decision involves a real tradeoff: splitting reduces slippage but doubles the number of orders, which:
- Uses twice the rate limit weight
- Requires reconciling two fills instead of one
- Creates partial fill risk (one venue fills, other doesn’t)
- Introduces race conditions (market moves between the two order placements)
For most retail crypto strategies, splitting makes sense only when the order size exceeds roughly $100K notional and the price impact of single-venue execution is measurable.
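When a split is justified, step 6 of the decision tree allocates proportionally to available liquidity. A minimal sketch of that allocation, assuming a simple mapping from venue name to quantity available at an acceptable price, is below. The function name is hypothetical, and the sketch ignores minimum order sizes and lot-size rounding, which real venues enforce.

```python
def allocate_proportional(available: dict[str, float], order_qty: float) -> dict[str, float]:
    """Split order_qty across venues in proportion to each venue's available quantity.

    If the order is larger than total available liquidity, only the available
    amount is allocated; the caller decides what to do with the remainder.
    """
    usable = {venue: qty for venue, qty in available.items() if qty > 0}
    total = sum(usable.values())
    if total <= 0 or order_qty <= 0:
        return {}
    fill_qty = min(order_qty, total)
    # Because fill_qty <= total, no venue is ever asked for more than it shows
    return {venue: fill_qty * qty / total for venue, qty in usable.items()}


# Hypothetical liquidity snapshot
alloc = allocate_proportional({"venue_a": 0.6, "venue_b": 0.3, "venue_c": 0.1}, 0.5)
print(alloc)
```

Proportional allocation keeps the relative price impact similar on each venue. In practice you would also drop allocations below a venue's minimum order size and fold them into the largest leg.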
What Breaks When One Venue Degrades
The SOR must handle venue degradation gracefully. Degradation modes:
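Rate limit approaching: Route around the venue. Remove it from the candidate set when rate_limit_headroom < 15%. Re-enable it only once headroom has recovered well above that level (the code below uses 30%), so the venue does not flap in and out of the candidate set.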
Elevated latency: If order placement RTT for a venue exceeds 3× its normal baseline, route around it. Elevated latency means orders are arriving late and fills are happening at worse prices than intended.
WebSocket lag (sequence gap): When the order book for a venue is desynchronized (see Order Book Reconstruction at Scale), the SOR is operating on a stale book. Route around venues with stale books or increase the minimum acceptable liquidity threshold to account for uncertainty.
Venue downtime: The hardest case. If a venue is down and you have open orders on it:
- You can’t cancel those orders (venue is down)
- You don’t know if they’re still working, partially filled, or cancelled
- Your position may be uncertain
The correct response: mark all outstanding orders on the degraded venue as “uncertain”, route all new orders to other venues, and reconcile position when the venue recovers. Do not assume orders were cancelled just because you can’t reach the venue - they may fill when connectivity is restored.
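One way to make the "uncertain" state explicit is a small order ledger. The sketch below is an assumption about how this could be structured, not the original system: the class names and the shape of the recovery report (order_id mapped to filled quantity and an open flag) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class OrderState(Enum):
    WORKING = "working"
    UNCERTAIN = "uncertain"  # Venue unreachable, true state unknown
    FILLED = "filled"
    CANCELLED = "cancelled"


@dataclass
class TrackedOrder:
    order_id: str
    venue: str
    side: str  # "buy" or "sell"
    quantity: float
    filled_qty: float = 0.0
    state: OrderState = OrderState.WORKING


class OrderLedger:
    def __init__(self) -> None:
        self._orders: dict[str, TrackedOrder] = {}

    def add(self, order: TrackedOrder) -> None:
        self._orders[order.order_id] = order

    def mark_venue_uncertain(self, venue: str) -> list[str]:
        """Flag every working order on a disconnected venue as uncertain."""
        flagged = []
        for o in self._orders.values():
            if o.venue == venue and o.state is OrderState.WORKING:
                o.state = OrderState.UNCERTAIN
                flagged.append(o.order_id)
        return flagged

    def worst_case_exposure(self, venue: str) -> float:
        """Signed quantity that could still fill on uncertain orders (+buy, -sell)."""
        total = 0.0
        for o in self._orders.values():
            if o.venue == venue and o.state is OrderState.UNCERTAIN:
                unfilled = o.quantity - o.filled_qty
                total += unfilled if o.side == "buy" else -unfilled
        return total

    def reconcile(self, venue: str, report: dict[str, tuple[float, bool]]) -> None:
        """Apply the venue's post-recovery report: order_id -> (filled_qty, still_open)."""
        for o in self._orders.values():
            if o.venue != venue or o.state is not OrderState.UNCERTAIN:
                continue
            if o.order_id not in report:
                continue  # Still unknown: do not assume it was cancelled
            o.filled_qty, still_open = report[o.order_id]
            if still_open:
                o.state = OrderState.WORKING
            elif o.filled_qty >= o.quantity:
                o.state = OrderState.FILLED
            else:
                o.state = OrderState.CANCELLED
```

The `worst_case_exposure` figure is what risk checks should use while the venue is down: it is the position change you would see if every uncertain order filled in full.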
class VenueStatus:
    HEALTHY = "healthy"
    DEGRADED_LATENCY = "degraded_latency"
    DEGRADED_RATE_LIMIT = "degraded_rate_limit"
    STALE_BOOK = "stale_book"
    DISCONNECTED = "disconnected"


class SORVenueManager:
    def __init__(self, venues: list[str]):
        self._status: dict[str, str] = {v: VenueStatus.HEALTHY for v in venues}
        self._rate_limit_headroom: dict[str, float] = {v: 1.0 for v in venues}
        self._last_order_latency_ms: dict[str, float] = {v: 0.0 for v in venues}
        self._baseline_latency_ms: dict[str, float] = {}

    def get_eligible_venues(self) -> list[str]:
        """Return venues eligible for order routing."""
        return [
            v for v, status in self._status.items()
            if status == VenueStatus.HEALTHY
        ]

    def update_order_latency(self, venue: str, latency_ms: float) -> None:
        self._last_order_latency_ms[venue] = latency_ms
        if venue not in self._baseline_latency_ms:
            # First sample seeds the baseline
            self._baseline_latency_ms[venue] = latency_ms
            return
        baseline = self._baseline_latency_ms[venue]
        if latency_ms > baseline * 3:
            self._status[venue] = VenueStatus.DEGRADED_LATENCY
            return
        # Exponential moving average update for baseline (normal samples only)
        self._baseline_latency_ms[venue] = 0.95 * baseline + 0.05 * latency_ms
        # Recover once a sample is back under the 3x threshold. A degraded venue
        # receives no orders, so this relies on probe or heartbeat latency samples.
        if self._status[venue] == VenueStatus.DEGRADED_LATENCY:
            self._status[venue] = VenueStatus.HEALTHY

    def update_rate_limit(self, venue: str, headroom: float) -> None:
        self._rate_limit_headroom[venue] = headroom
        if headroom < 0.15:
            self._status[venue] = VenueStatus.DEGRADED_RATE_LIMIT
        elif self._status[venue] == VenueStatus.DEGRADED_RATE_LIMIT and headroom > 0.3:
            # Hysteresis: re-enable at 30%, not 15%, to avoid flapping
            self._status[venue] = VenueStatus.HEALTHY

    def mark_book_stale(self, venue: str) -> None:
        self._status[venue] = VenueStatus.STALE_BOOK

    def mark_book_synchronized(self, venue: str) -> None:
        if self._status[venue] == VenueStatus.STALE_BOOK:
            self._status[venue] = VenueStatus.HEALTHY
The Latency Budget for SOR Decisions
The SOR adds latency to every order. The question is: how much is acceptable?
For co-located strategies (1-5µs to exchange):
- SOR decision must complete in < 1µs
- Requires C++ or Rust implementation
- Algorithmic complexity must be O(n) or lower on number of venues
- All market data must be in L1/L2 cache
For cloud-based strategies (10-200ms to exchange):
- SOR decision can take up to 100µs
- Python is acceptable for the routing logic
- Market data doesn’t need to be cache-resident
For Dubai-to-Asia routing (90-175ms to exchange):
- SOR can take up to 1ms - a rounding error relative to network RTT
- Python implementation with full order book access is fine
My implementation: Python with asyncio, all order book data in memory, SOR decision takes ~50µs. At 175ms network RTT to Binance, this is < 0.03% overhead - negligible.
import time
from typing import Optional


async def route_order(
    symbol: str,
    side: str,
    quantity: float,
    urgency: str,  # "high" or "low"
    order_books: dict,  # symbol -> exchange -> book
    venue_manager: SORVenueManager,
    fee_schedules: dict[str, VenueFeeSchedule],
) -> Optional[dict]:
    """
    Full SOR routing decision. Returns order parameters for the selected venue.
    """
    t0 = time.perf_counter_ns()
    eligible_venues = venue_manager.get_eligible_venues()
    if not eligible_venues:
        return None
    books_for_symbol = order_books.get(symbol, {})
    is_buy = (side == 'buy')
    # High urgency crosses the spread (taker); low urgency rests in the book (maker)
    is_taker = (urgency == "high")
    candidates = []
    for venue in eligible_venues:
        book = books_for_symbol.get(venue)
        if book is None or not book.is_synchronized:
            continue
        # Takers price off the opposite side of the book, makers join their own side:
        # taker buy / maker sell -> best ask; taker sell / maker buy -> best bid
        level = book.best_ask() if is_buy == is_taker else book.best_bid()
        if not level:
            continue
        nominal_price, level_qty = level[0], level[1]
        eff_price = fee_adjusted_price(
            nominal_price=nominal_price,
            is_buy=is_buy,
            is_taker=is_taker,
            fee=fee_schedules[venue],
        )
        candidates.append({
            'venue': venue,
            'nominal_price': nominal_price,
            'eff_price': eff_price,
            # Displayed size only caps orders that take liquidity
            'available_qty': level_qty if is_taker else quantity,
            'is_taker': is_taker,
        })
    if not candidates:
        return None
    # Sort by effective price (ascending for buy, descending for sell)
    reverse = (side == 'sell')
    candidates.sort(key=lambda c: c['eff_price'], reverse=reverse)
    best = candidates[0]
    decision_latency_us = (time.perf_counter_ns() - t0) / 1000
    # Log decision latency for monitoring - alert if > 500µs
    return {
        'venue': best['venue'],
        'symbol': symbol,
        'side': side,
        'quantity': min(quantity, best['available_qty']),
        'price': best['nominal_price'],
        'order_type': 'MARKET' if best['is_taker'] else 'LIMIT',
        'time_in_force': 'IOC' if best['is_taker'] else 'GTC',
        'decision_latency_us': decision_latency_us,
    }
How This Breaks in Production
1. Position drift when one venue fills and another doesn’t
Symptom: Strategy’s net position doesn’t match expected. Reconciliation shows fills on Venue A but no fills on Venue B despite both orders being placed.
Root cause: Split order - half on Venue A, half on Venue B. Venue B filled but the fill notification was delayed (WebSocket lag). Strategy placed a second order before receiving the Venue B fill, resulting in double the intended size.
Fix: Track pending fills per venue explicitly. Don’t place follow-up orders until all pending fills from the current order batch are confirmed or timed out.
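A minimal sketch of that gating logic, assuming each child order has a unique ID and that a timed-out order is reconciled out of band (for example via a REST order-status query) before trading resumes. `PendingFillTracker` and its method names are hypothetical.

```python
import time
from typing import Callable


class PendingFillTracker:
    """Blocks follow-up orders until every order in the batch is resolved or timed out."""

    def __init__(self, timeout_s: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout_s = timeout_s
        self._clock = clock
        self._deadlines: dict[str, float] = {}  # order_id -> resolution deadline

    def order_sent(self, order_id: str) -> None:
        self._deadlines[order_id] = self._clock() + self._timeout_s

    def order_resolved(self, order_id: str) -> None:
        """Call on any terminal update: filled, cancelled, or rejected."""
        self._deadlines.pop(order_id, None)

    def expire(self) -> list[str]:
        """Remove and return orders past their deadline; the caller must reconcile them."""
        now = self._clock()
        expired = [oid for oid, deadline in self._deadlines.items() if now >= deadline]
        for oid in expired:
            del self._deadlines[oid]
        return expired

    def can_place_follow_up(self) -> bool:
        return not self._deadlines
```

Timed-out orders are returned rather than silently dropped so the strategy has to look up their real state before sizing the next order. The injectable clock is there only to make the tracker testable without sleeping.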
2. Hedge imbalance from stale routing decision
Symptom: Delta-neutral strategy shows net delta exposure. One leg of a hedge filled, the other didn’t.
Root cause: SOR selected Venue B for the hedge leg based on order book state from 200ms ago. By the time the order arrived, Venue B’s liquidity had moved. The hedge order filled partially or not at all.
Fix: For latency-sensitive hedges, use IOC orders rather than GTC. An IOC that doesn’t fill in full is a clear signal to immediately re-hedge. A GTC that partially fills and sits in the book creates a longer-lived position discrepancy.
3. Rate limit headroom calculation wrong - venue over-routed
Symptom: After 5 minutes of trading, all orders start failing with “429 too many requests” on a specific venue.
Root cause: Rate limit headroom was computed from the last REST response’s X-MBX-USED-WEIGHT-1M header, but multiple concurrent orders were in flight. The actual consumed weight was higher than the last observed value.
Fix: Track in-flight requests separately. Add estimated weight of pending requests to observed weight before computing headroom. Err conservative - if headroom is uncertain, route to the next best venue.
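A minimal sketch of the in-flight accounting, assuming the caller knows the weight of each request it sends and reads the used-weight value from each response header. The class and method names are illustrative.

```python
class RateLimitTracker:
    """Headroom estimate that counts requests still in flight, not just the last header."""

    def __init__(self, limit_weight: int) -> None:
        self.limit_weight = limit_weight
        self.observed_weight = 0   # Last used-weight value reported by the venue
        self.in_flight_weight = 0  # Weight of requests sent but not yet answered

    def request_sent(self, weight: int) -> None:
        self.in_flight_weight += weight

    def response_received(self, weight: int, observed_used_weight: int) -> None:
        self.in_flight_weight = max(0, self.in_flight_weight - weight)
        self.observed_weight = observed_used_weight

    def headroom(self) -> float:
        """Fraction of the limit remaining, erring conservative (0.0-1.0)."""
        # The observed value may already include some in-flight requests; adding
        # them anyway overestimates usage, which is the safe direction.
        used = self.observed_weight + self.in_flight_weight
        return max(0.0, 1.0 - used / self.limit_weight)


tracker = RateLimitTracker(limit_weight=1200)
tracker.response_received(weight=0, observed_used_weight=900)
before = tracker.headroom()   # Based on the header alone
for _ in range(3):
    tracker.request_sent(weight=50)
after = tracker.headroom()    # With three requests outstanding
print(before, after)
```

In this example the header alone suggests 25% headroom, while the in-flight-aware figure is 12.5%, below the 15% cutoff used elsewhere in this article.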
4. Fee-adjusted routing ignores maker/taker transition
Symptom: During fast markets, SOR is routing to Venue A because of lower taker fees. But the strategy is actually posting maker orders (the price isn’t crossing). Venue B has better maker rebate.
Root cause: SOR computes effective price using taker fee for all orders, even when the strategy intends to rest in the book.
Fix: Pass urgency/intended order type to fee_adjusted_price. Use maker_rebate for intended maker orders, taker_fee for intended taker orders.
5. Venue down → all orders concentrated on remaining venues → rate limits
Symptom: When one of three venues goes down, the other two hit rate limits within seconds. Orders start failing.
Root cause: Order flow that was spread across three venues is now concentrated on two. Each venue’s rate limit was sized for 1/3 of the total order flow. At 1/2 each, limits are exceeded.
Fix: Implement order flow shaping. When venue count drops from 3 to 2, reduce total order rate by 33% or raise risk thresholds to trade fewer symbols until capacity is restored.
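The shaping rule amounts to keeping the per-venue rate constant as venues drop out. A minimal sketch, assuming flow is spread evenly across healthy venues (the function name is hypothetical):

```python
def shaped_total_rate(base_total_rate: float, configured_venues: int,
                      healthy_venues: int) -> float:
    """Scale total order rate so each healthy venue sees no more than its usual share."""
    if configured_venues <= 0 or healthy_venues <= 0:
        return 0.0
    healthy = min(healthy_venues, configured_venues)
    return base_total_rate * healthy / configured_venues


# 90 orders/s sized for three venues (30 each); one venue goes down
total = shaped_total_rate(90.0, configured_venues=3, healthy_venues=2)
per_venue = total / 2
print(total, per_venue)
```

Dropping from three venues to two cuts the total rate by a third, leaving each surviving venue at the 30 orders/s it was originally sized for.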
6. SOR decision latency spikes under order book resync
Symptom: Order placement latency (measured from signal to send) spikes to 5-10ms intermittently, correlating with order book resync events.
Root cause: When an order book is resyncing, the SOR iterates over venues and calls book.is_synchronized, which acquires a lock. The resync process holds this lock while fetching the REST snapshot, blocking SOR decisions for 200-800ms.
Fix: Use a non-blocking read for is_synchronized - a simple boolean flag without locking. Accept the possibility of routing to a briefly stale book (better than stalling all routing). Implement lock-free book status tracking.
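A minimal sketch of lock-free status tracking, assuming a threaded resync worker. It relies on a plain attribute read being effectively atomic in CPython; the lock here only serializes resyncs and is never taken on the read path. `BookStatus` and `fetch_snapshot` are illustrative names.

```python
import threading
from typing import Callable


class BookStatus:
    def __init__(self) -> None:
        self._synchronized = False             # Plain flag; reads never block
        self._resync_lock = threading.Lock()   # Serializes resyncs only
        self.snapshot: dict = {}

    @property
    def is_synchronized(self) -> bool:
        # No lock: may be momentarily out of date, but never stalls routing
        return self._synchronized

    def resync(self, fetch_snapshot: Callable[[], dict]) -> None:
        with self._resync_lock:
            self._synchronized = False
            snapshot = fetch_snapshot()  # Slow REST call; readers are not blocked
            self.snapshot = snapshot
            self._synchronized = True
```

While a resync is running, the SOR simply sees `is_synchronized == False` and skips the venue, instead of waiting on the snapshot fetch.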
For the order book data that feeds this SOR, see Order Book Reconstruction at Scale. For the exchange-specific quirks that affect routing decisions, see Binance Connectivity Deep Dive and OKX, Bybit, and Deribit API Guide. For the venue geography that determines routing fallbacks, see Exchange Co-Location in the Cloud Era. For the fee and rebate mechanics that drive fee-adjusted pricing, see Rebate Capture and Maker-Taker Dynamics.