Multicast Market Data: Source-Specific Multicast, A/B Feed Arbitration, and Gap Detection

How institutional market data infrastructure works with IGMPv3 SSM, A/B feed redundancy, and sequence-based gap detection, and why crypto is converging on this model.

12 min
#multicast #market-data #networking #hft #igmp #infrastructure

At Gemini, we ran PTP clock synchronization with Solarflare NICs capable of hardware timestamping at sub-100ns resolution. Part of why that precision mattered was the downstream infrastructure: our market data feeds used multicast delivery, and the timestamps on each packet were load-bearing for determining which A/B feed packet arrived first. Understanding multicast market data isn’t just academic - it’s the architecture that institutional equity and futures markets have used for two decades, and it’s the model that high-performance crypto co-location venues are converging toward.

This post covers how multicast market data works, why it’s architecturally superior to unicast WebSocket feeds, and what changes in your infrastructure when you operate against it.

Why Multicast for Market Data?

The fundamental problem with unicast delivery of market data: if you have 500 subscribers to BTCUSD level-2 updates, the exchange must maintain 500 TCP connections and send each update 500 times. At 10,000 updates per second, that’s 5 million messages per second, 500 TCP sessions to manage, and the exchange’s network egress costs scale linearly with subscriber count.

With multicast, the exchange sends each update exactly once. The network infrastructure (switches, routers) replicates the packet to all subscribers at the network layer. The exchange sends 10,000 messages per second regardless of whether it has 10 subscribers or 10,000.
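The fan-out arithmetic can be sketched in a few lines. This is a minimal illustration using the figures from the text; the function name `exchange_send_rate` is made up for the example.

```python
def exchange_send_rate(updates_per_sec: int, subscribers: int, multicast: bool) -> int:
    """Messages per second the exchange itself has to transmit."""
    # Unicast: one copy per subscriber.
    # Multicast: one copy total; switches and routers replicate it downstream.
    return updates_per_sec if multicast else updates_per_sec * subscribers


unicast_rate = exchange_send_rate(10_000, 500, multicast=False)
multicast_rate = exchange_send_rate(10_000, 500, multicast=True)
print(unicast_rate, multicast_rate)  # 5000000 10000
```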

For subscribers, multicast means zero connection state. You don’t maintain a TCP session with the exchange. You join a multicast group, and the network delivers packets to your NIC’s receive buffer. No handshake, no reconnection logic, no keepalive heartbeats.

The tradeoff: multicast is UDP. There’s no retransmission. A dropped packet is gone. The entire reliability model shifts to the application layer - which is why gap detection and A/B feed redundancy exist.

IGMPv3 and Source-Specific Multicast

IGMP (Internet Group Management Protocol) is how your host tells its local router that it wants to receive packets destined for a particular multicast group address. Version 3 added source filtering, which is what makes Source-Specific Multicast (SSM) possible.

Any-Source Multicast (ASM, IGMPv1/v2):

Join group 239.1.2.3
→ Receive all packets sent to 239.1.2.3 from any source

Source-Specific Multicast (SSM, IGMPv3):

Join group 232.1.2.3 from source 10.0.1.100
→ Receive only packets sent to 232.1.2.3 by source 10.0.1.100

SSM is critical for market data because:

  1. It limits spoofing - your host and the routers on the path deliver only packets whose source address matches the exchange’s published source IP
  2. It avoids rendezvous-point complexity - ASM relies on PIM-SM rendezvous points (and MSDP across domains) for source discovery, which is fragile; SSM needs only the simpler PIM-SSM because the receiver names the source up front
  3. It simplifies filtering - you don’t need an ACL to block unwanted multicast sources
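A quick sanity check before joining is to confirm that a group address falls in the range your network treats as SSM. The sketch below assumes the default SSM block from RFC 4607 (232.0.0.0/8); a network can be configured to treat other ranges as SSM, so `SSM_RANGE` is an assumption to adjust, and `classify_group` is a name invented for this example.

```python
import ipaddress

MULTICAST_RANGE = ipaddress.ip_network("224.0.0.0/4")
# Default SSM block per RFC 4607; operators may configure additional ranges.
SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")


def classify_group(group: str) -> str:
    """Return 'ssm', 'asm', or 'not-multicast' for an IPv4 address."""
    addr = ipaddress.ip_address(group)
    if addr not in MULTICAST_RANGE:
        return "not-multicast"
    return "ssm" if addr in SSM_RANGE else "asm"
```

Running this against the addresses in a venue's channel configuration at startup catches typos and unicast addresses before they turn into silent "no data" problems.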

To join an SSM group in Linux:

import socket
import struct

# Linux value of IP_ADD_SOURCE_MEMBERSHIP (from <linux/in.h>). The socket
# module only exposes the named constant on newer Python versions.
IP_ADD_SOURCE_MEMBERSHIP = getattr(socket, "IP_ADD_SOURCE_MEMBERSHIP", 39)


def join_ssm_group(
    sock: socket.socket,
    interface_ip: str,
    multicast_group: str,
    source_ip: str,
) -> None:
    """
    Join a Source-Specific Multicast group (Linux only).

    The field order below matches the Linux struct ip_mreq_source;
    other platforms order the fields differently.
    """
    # struct ip_mreq_source {
    #   in_addr imr_multiaddr;  /* multicast group */
    #   in_addr imr_interface;  /* local interface */
    #   in_addr imr_sourceaddr; /* source address */
    # }
    mreq_source = struct.pack(
        "4s4s4s",
        socket.inet_aton(multicast_group),
        socket.inet_aton(interface_ip),
        socket.inet_aton(source_ip),
    )

    sock.setsockopt(socket.IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, mreq_source)


def create_multicast_receiver(
    port: int,
    multicast_group: str,
    source_ip: str,
    interface_ip: str,
) -> socket.socket:
    """Create a socket ready to receive SSM multicast market data."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)

    # Allow multiple processes to bind to the same port
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)

    # Large receive buffer for burst absorption. The kernel silently caps
    # the request at net.core.rmem_max, so raise that sysctl as well.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4 * 1024 * 1024)

    # Bind to the multicast group address and port. On Linux this also
    # filters out datagrams addressed to other groups on the same port.
    sock.bind((multicast_group, port))

    # Join the SSM group
    join_ssm_group(sock, interface_ip, multicast_group, source_ip)

    return sock
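Once datagrams arrive, the first job is to pull the sequence number out of each one. The sketch below assumes a 12-byte little-endian packet header made of a 32-bit sequence number followed by a 64-bit sending time in nanoseconds. That layout is an assumption for illustration; the real layout comes from the venue's protocol specification. `PacketHeader` and `parse_header` are hypothetical names.

```python
import struct
from typing import NamedTuple

# Assumed layout: little-endian uint32 sequence number, uint64 send time (ns).
PACKET_HEADER = struct.Struct("<IQ")


class PacketHeader(NamedTuple):
    seq: int
    sending_time_ns: int


def parse_header(datagram: bytes) -> tuple[PacketHeader, bytes]:
    """Split a datagram into its header fields and the remaining payload."""
    if len(datagram) < PACKET_HEADER.size:
        raise ValueError("datagram shorter than packet header")
    seq, sending_time_ns = PACKET_HEADER.unpack_from(datagram, 0)
    return PacketHeader(seq, sending_time_ns), datagram[PACKET_HEADER.size:]
```

In a receive loop this would sit directly after `sock.recvfrom(65535)`, with the parsed `seq` handed to whatever performs A/B arbitration.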

CME Globex Multicast Architecture: The Reference Implementation

CME Globex is CME Group’s electronic trading platform, one of the highest-volume futures venues in the world, and its MDP 3.0 market data protocol is a widely used reference point for institutional multicast market data. Understanding it gives you the conceptual model for any similar system.

CME delivers market data via two independent multicast channels, called A and B feeds:

Channel A:  239.195.0.0/16 range, source 205.209.x.x
Channel B:  239.197.0.0/16 range, source 205.209.x.x (different source)

Both channels carry identical data, from independent server paths.

Each instrument group (e.g., “ES” - E-mini S&P futures) has a dedicated channel with a specific multicast group address and port:

Instrument Group 310 (ES Futures):
  Channel A: 239.195.1.1:14310
  Channel B: 239.197.1.1:24310

The addresses above are illustrative; the authoritative values come from the exchange’s channel specification document (a “connectivity guide”), which maps each instrument to its channels. You subscribe to the channels for the instruments you trade.
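On the client side, the channel specification usually ends up as a lookup table. A minimal sketch, reusing the example addresses from the listing above (they are placeholders, not real production values); `Endpoint`, `Channel`, and `endpoints_for` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    group: str
    port: int


@dataclass(frozen=True)
class Channel:
    channel_id: int
    description: str
    feed_a: Endpoint
    feed_b: Endpoint


# Placeholder values copied from the example in the text.
CHANNELS = {
    310: Channel(
        channel_id=310,
        description="ES Futures",
        feed_a=Endpoint("239.195.1.1", 14310),
        feed_b=Endpoint("239.197.1.1", 24310),
    ),
}


def endpoints_for(channel_id: int) -> list[Endpoint]:
    """Both endpoints a subscriber must join for one instrument group."""
    channel = CHANNELS[channel_id]
    return [channel.feed_a, channel.feed_b]
```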

A/B Feed Arbitration: The Algorithm

The purpose of dual feeds is redundancy, not diversity. Both feeds carry identical messages. Your client must:

  1. Accept whichever feed delivers each message first
  2. Use the other feed to fill any gaps in the primary feed
  3. Not double-process a message received on both feeds

The arbitration algorithm is based on sequence numbers:

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FeedArbitrator:
    """
    Maintains sequence state for a pair of A/B redundant multicast feeds.
    Both feeds carry identical data - accept whichever arrives first.
    """
    expected_seq: int = 1
    # Every sequence number seen so far. Grows without bound in this sketch;
    # a long-running process should prune entries below expected_seq.
    _received: set = field(default_factory=set)
    # Out-of-order packets held until the gap in front of them is filled
    _buffer: dict = field(default_factory=dict)
    _gap_start: Optional[int] = None
    _gap_detected_at: Optional[int] = None

    def process_packet(
        self,
        seq: int,
        data: bytes,
        feed: str,  # 'A' or 'B'
        timestamp_ns: int,
    ) -> list:
        """
        Returns the payloads that are now deliverable, in sequence order:
        this packet plus any buffered packets it unblocks.
        Returns an empty list for duplicates and for packets ahead of a gap.
        """
        if seq in self._received or seq < self.expected_seq:
            # Duplicate from the other feed, or already processed - discard
            return []

        self._received.add(seq)

        if seq > self.expected_seq:
            # Gap: messages expected_seq through seq-1 are missing.
            # Don't advance expected_seq - wait for gap fill or snapshot.
            if self._gap_start is None:
                self._gap_start = self.expected_seq
                self._gap_detected_at = timestamp_ns
            self._buffer[seq] = data
            return []

        # seq == expected_seq: deliver it, then drain the buffer
        ready = [data]
        self.expected_seq = seq + 1
        while self.expected_seq in self._buffer:
            ready.append(self._buffer.pop(self.expected_seq))
            self.expected_seq += 1

        if self._buffer:
            # A later gap is still open; restart the gap clock for it
            self._gap_start = self.expected_seq
            self._gap_detected_at = timestamp_ns
        else:
            self._gap_start = None
            self._gap_detected_at = None
        return ready

    def get_gap_age_ms(self, current_ns: int) -> Optional[float]:
        if self._gap_detected_at is None:
            return None
        return (current_ns - self._gap_detected_at) / 1_000_000

Arbitration flow in practice:

t=0:   Feed A delivers seq=100, Feed B delivers seq=100 (0.1µs later)
       → Process Feed A seq=100 (first arrival)
       → Discard Feed B seq=100 (already in _received set)

t=1:   Feed A delivers seq=102
       → Gap detected: expected=101, received=102
       → Buffer seq=102, wait for seq=101

t=2:   Feed B delivers seq=101 (Feed A had packet loss)
       → seq=101 is not in _received, and is expected_seq
       → Process Feed B seq=101
       → expected_seq advances to 102
       → seq=102 is in buffer and is now expected_seq
       → Process buffered seq=102
       → expected_seq advances to 103
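The trace above can be replayed with a compact, self-contained variant of the same idea. This sketch tracks only the next expected sequence number and a buffer of early arrivals, which is enough to show the ordering behaviour; `arbitrate` is a name invented for the example and the arrival list mirrors the t=0 to t=2 steps.

```python
def arbitrate(arrivals, first_seq):
    """Yield (seq, feed) pairs in sequence order, each sequence number once."""
    expected = first_seq
    buffered = {}  # seq -> feed, held until the gap in front of it fills
    for seq, feed in arrivals:
        if seq < expected or seq in buffered:
            continue  # duplicate from the other feed
        buffered[seq] = feed
        # Deliver everything that is now contiguous with what came before
        while expected in buffered:
            yield expected, buffered.pop(expected)
            expected += 1


# Feed A loses seq=101; Feed B supplies it.
trace = [(100, "A"), (100, "B"), (102, "A"), (101, "B"), (102, "B")]
delivered = list(arbitrate(trace, first_seq=100))
print(delivered)  # [(100, 'A'), (101, 'B'), (102, 'A')]
```

Note that seq=102 is delivered from Feed A even though it arrived before seq=101: it waited in the buffer until Feed B filled the gap.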

Recovery Channels and Snapshot Requests

When a gap can’t be filled by the alternate feed (both feeds missed the same packet - network partition affecting both paths), you need a different recovery mechanism.

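Exchanges typically provide two recovery mechanisms. The transport varies by venue: CME MDP 3.0, for example, serves replay of missed messages over a TCP request/response service and snapshots over multicast, so check the venue’s specification before assuming the multicast model described here.

Recovery Channel (also multicast): A slower, redundant retransmission multicast stream that replays recent messages. You subscribe to the recovery channel’s multicast group when a gap is detected and wait for the recovery feed to retransmit the missing sequence numbers.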

Snapshot Channel: Provides full order book snapshots for recovery from large gaps. Similar to the REST snapshot mechanism described in Order Book Reconstruction at Scale, but delivered via multicast.
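After a snapshot is applied, the incremental stream has to be rejoined at the right point. The sketch below assumes the snapshot states the sequence number of the last incremental message it reflects (called `snapshot_last_seq` here, a hypothetical name) and that early incrementals were buffered by sequence number while waiting.

```python
def resume_after_snapshot(
    snapshot_last_seq: int,
    buffered: dict[int, bytes],
) -> tuple[int, list[bytes]]:
    """
    Return (next expected sequence number, payloads that can be applied now).
    Mutates `buffered`: stale and applied entries are removed.
    """
    # Anything the snapshot already reflects is stale and must not be reapplied
    for seq in [s for s in buffered if s <= snapshot_last_seq]:
        del buffered[seq]

    # Apply buffered incrementals that follow the snapshot contiguously
    expected = snapshot_last_seq + 1
    ready = []
    while expected in buffered:
        ready.append(buffered.pop(expected))
        expected += 1
    return expected, ready
```

Anything left in `buffered` afterwards sits behind a new gap, so normal gap detection takes over from the returned sequence number.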

The gap detection policy determines when to switch to recovery:

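import asyncio
import time

GAP_RECOVERY_THRESHOLD_MS = 200  # After 200ms gap, request recovery
SNAPSHOT_THRESHOLD_MS = 5000     # After 5s gap, take full snapshot

async def gap_monitor(
    arbitrator: FeedArbitrator,
    recovery_feed: 'RecoveryFeedManager',
    snapshot_feed: 'SnapshotFeedManager',
) -> None:
    # Track what has already been requested for the current gap so that
    # each escalation step fires once, not on every 10ms poll.
    recovery_requested_for = None
    snapshot_requested_for = None
    while True:
        await asyncio.sleep(0.01)  # Check every 10ms
        # Must be the same clock that produced the timestamp_ns values
        # passed to FeedArbitrator.process_packet()
        now_ns = time.perf_counter_ns()
        gap_age_ms = arbitrator.get_gap_age_ms(now_ns)

        if gap_age_ms is None:
            continue  # No gap

        gap_start = arbitrator._gap_start
        if gap_age_ms > SNAPSHOT_THRESHOLD_MS:
            if snapshot_requested_for != gap_start:
                snapshot_requested_for = gap_start
                await snapshot_feed.request_snapshot()
        elif gap_age_ms > GAP_RECOVERY_THRESHOLD_MS:
            if recovery_requested_for != gap_start:
                recovery_requested_for = gap_start
                await recovery_feed.subscribe_recovery(from_seq=gap_start)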
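The escalation policy itself is easier to test when it is pulled out as a pure function, separate from the polling loop. A minimal sketch using the same two thresholds; `GapAction` and `choose_gap_action` are names invented for this example.

```python
from enum import Enum
from typing import Optional


class GapAction(Enum):
    NONE = "none"                  # no gap open
    WAIT_FOR_OTHER_FEED = "wait"   # young gap: let A/B arbitration fill it
    REQUEST_RECOVERY = "recovery"  # gap older than the recovery threshold
    REQUEST_SNAPSHOT = "snapshot"  # gap older than the snapshot threshold


def choose_gap_action(
    gap_age_ms: Optional[float],
    recovery_after_ms: float = 200.0,
    snapshot_after_ms: float = 5000.0,
) -> GapAction:
    """Map the age of the current gap to the recovery step to take."""
    if gap_age_ms is None:
        return GapAction.NONE
    if gap_age_ms > snapshot_after_ms:
        return GapAction.REQUEST_SNAPSHOT
    if gap_age_ms > recovery_after_ms:
        return GapAction.REQUEST_RECOVERY
    return GapAction.WAIT_FOR_OTHER_FEED
```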

Why This Matters for Crypto: The Convergence Story

Current crypto exchange market data delivery:

  • REST polling (worst latency)
  • WebSocket unicast (better, but 500 clients = 500 TCP connections)
  • WebSocket with no sequence numbers (Binance includes update IDs in its depth streams; many exchanges send nothing comparable)

High-performance crypto co-location venues (like the co-location services at Equinix datacenters in Tokyo and Singapore where Bybit and OKX servers reside) will eventually offer multicast market data feeds to co-located clients. Multicast delivery is already standard at CME’s co-location facility in Aurora, IL, and it is the natural evolution as crypto venues mature and attract the same institutional HFT clients that drove CME to build this infrastructure.

When that happens, understanding A/B feed arbitration, SSM, and gap detection will be essential. Even if you’re not at a co-location venue, understanding this model helps you build more resilient WebSocket clients: the same gap detection logic, the same “buffer before snapshot” pattern, the same redundant feed concept applies to any streaming market data.
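For a WebSocket feed, the same gap check can be applied to whatever sequencing the venue provides. The sketch below assumes each update carries the first and last update ID it covers; the parameter names `first_id` and `last_id` are hypothetical and need mapping to the venue's actual fields. `WsSequenceTracker` is a name invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WsSequenceTracker:
    """Detects duplicates and gaps in a stream of ID-ranged updates."""
    last_id: Optional[int] = None

    def accept(self, first_id: int, last_id: int) -> str:
        """Return 'apply', 'duplicate', or 'gap' for one incoming update."""
        if self.last_id is None:
            self.last_id = last_id
            return "apply"
        if last_id <= self.last_id:
            return "duplicate"  # already covered by earlier updates
        if first_id > self.last_id + 1:
            return "gap"        # IDs were skipped; resync from a snapshot
        self.last_id = last_id
        return "apply"
```

On `"gap"` the tracker deliberately leaves its state untouched, so the caller can buffer subsequent updates and resynchronize from a snapshot, as in the multicast case.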

The other relevant context: if you’re building infrastructure for a trading firm that operates in both traditional markets (CME, ICE, Euronext) and crypto markets, you need to speak both protocols. I built infrastructure at Akuna and Gemini that bridged these worlds - the mental model transfers directly.

Network Infrastructure Requirements

Multicast market data has specific network infrastructure requirements that don’t apply to unicast:

Managed switches with IGMP snooping: An L2 switch without IGMP snooping floods all multicast to all ports (treating it like broadcast). With IGMP snooping, the switch tracks which ports have active group memberships and forwards multicast only to ports that have joined the group. Without it, multicast floods your entire network segment.

PIM-SSM routing for cross-subnet delivery: If your subscriber is on a different subnet from the multicast source, you need router-level multicast routing. PIM-SSM (Protocol Independent Multicast - Source-Specific Multicast) is the protocol. Most enterprise routers support it but it requires explicit configuration.

Separate NICs for multicast receive: Your multicast receiver should use a NIC dedicated to receive, not shared with order placement or other traffic. Burst absorption and interrupt handling for high-rate multicast compete with the latency-sensitive path of order submission.

Kernel bypass for true low latency: At the lowest latency tier, multicast market data receivers use kernel bypass (DPDK or ef_vi/OpenOnload on Solarflare) to read packets directly from NIC hardware, bypassing the kernel’s network stack. This removes ~1-3µs of kernel processing latency per packet. See Solarflare ef_vi, DPDK, and AF_XDP for the implementation details.

How This Breaks in Production

1. IGMP snooping disabled - multicast floods the network

Symptom: All hosts on a network segment receive market data packets, regardless of whether they’ve joined the group. CPU usage spikes on unrelated servers. Network engineers notice high broadcast/multicast traffic on switch counters.
Root cause: Switch’s IGMP snooping is disabled or not configured. All multicast is treated as broadcast.
Fix: Enable IGMP snooping on all switches in the path. Verify in two places: ip maddr show on the receiver confirms the host has joined the specific group, and the switch’s port statistics confirm that ports which have not joined no longer receive the traffic.

2. Both A and B feeds drop the same packet

Symptom: Gap detector fires. Neither feed delivers the missing sequence. Recovery channel also shows the gap (the packet was never sent, due to an exchange-side issue).
Root cause: The exchange had a momentary outage affecting both A and B feed servers simultaneously (rare but happens during releases and failovers). The packet was lost before being sent to either feed.
Fix: Always implement snapshot recovery as a fallback when gap age exceeds 5 seconds. Don’t assume the alternate feed will always fill the gap.

3. Clock skew between A and B feeds causes incorrect arbitration ordering

Symptom: During high-throughput periods, occasional out-of-order application of updates. Position appears inconsistent for 100-200ms then self-corrects.
Root cause: Your arbitrator breaks ties between feeds by timestamp. The A and B feed servers have a 50µs clock difference, so identical packets are timestamped differently at the source. Ordering by timestamp is wrong here, because the two packets are the same message.
Fix: Use sequence number as the sole arbitration criterion. Timestamps are only for gap age calculation, not for ordering identical messages.

4. Recovery channel subscription loop under heavy load

Symptom: During a volatile market, your system enters a subscription loop: gap detected, subscribe to recovery, recovery arrives, unsubscribe; another gap, subscribe again. The repeated subscribe/unsubscribe operations themselves add latency.
Root cause: Recovery channel subscription has network-level join latency (IGMP join propagation: 100-500ms). If your gap detection threshold is too short, you join recovery before the alternate feed has time to fill the gap via normal A/B arbitration.
Fix: Increase gap detection threshold to at least 200ms. A/B feeds should fill gaps within 1-5ms in a healthy network, so 200ms leaves a 40-200x margin over normal jitter. Recovery is a fallback for serious drops, not micro-jitter.

5. Multicast group address conflict across environments

Symptom: In staging, your client receives production market data. Or vice versa. Or two applications receive each other’s multicast data and crash due to unexpected message formats.
Root cause: Development and production multicast groups share the same address. When both environments are active on the same network, both receive both streams.
Fix: Use non-overlapping multicast address ranges per environment. Document the range allocation (e.g., 239.10.x.x for dev, 239.20.x.x for staging, 239.195.x.x for production CME-format).
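One way to keep the allocation honest is to encode it and check it at startup or in CI. The sketch below uses the example ranges from the fix above, expressed as /16 networks; `ENV_RANGES`, `overlapping_envs`, and `env_for_group` are names invented for this example.

```python
import ipaddress
from itertools import combinations

# Example allocation from the text, written as /16 networks.
ENV_RANGES = {
    "dev": ipaddress.ip_network("239.10.0.0/16"),
    "staging": ipaddress.ip_network("239.20.0.0/16"),
    "prod": ipaddress.ip_network("239.195.0.0/16"),
}


def overlapping_envs(ranges):
    """Return pairs of environment names whose ranges overlap."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(ranges.items(), 2)
        if net_a.overlaps(net_b)
    ]


def env_for_group(group, ranges=ENV_RANGES):
    """Return the environment that owns a group address, or None."""
    addr = ipaddress.ip_address(group)
    for env, net in ranges.items():
        if addr in net:
            return env
    return None
```

A receiver can then refuse to start if `env_for_group` for a configured group does not match the environment it is deployed in.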

6. Socket receive buffer overflow during burst

Symptom: During market open or major macro event, gap detection fires even though the network delivered every packet on both feeds. The packets reached the kernel, but your application did not read them in time.
Root cause: Your application reader is slow (Python GIL, disk I/O, etc.) relative to the incoming packet rate. Packets accumulate in the socket receive buffer. When the buffer fills (its size is the SO_RCVBUF request, capped by net.core.rmem_max), new packets are dropped at the kernel level.
Fix: Use a dedicated high-priority thread or kernel-bypass path for market data receive. Set SO_RCVBUF to at least 4 MB and raise net.core.rmem_max to match. Monitor the RcvbufErrors counter in /proc/net/snmp (also shown by netstat -su) and the per-socket drops column in /proc/net/udp.
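Kernel-level UDP drops can be watched from the process itself. On Linux, /proc/net/snmp carries a header line and a value line, both prefixed with "Udp:"; the sketch below parses that pair. `parse_udp_counters` and `read_rcvbuf_errors` are names invented for this example, and the sample text in the usage is made up.

```python
def parse_udp_counters(snmp_text: str) -> dict[str, int]:
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp content."""
    udp_lines = [
        line.split() for line in snmp_text.splitlines() if line.startswith("Udp:")
    ]
    if len(udp_lines) < 2:
        raise ValueError("no Udp section found")
    names, values = udp_lines[0][1:], udp_lines[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def read_rcvbuf_errors(path: str = "/proc/net/snmp") -> int:
    """System-wide count of UDP datagrams dropped because a buffer was full."""
    with open(path) as f:
        return parse_udp_counters(f.read())["RcvbufErrors"]


# Made-up sample in the same shape as the real file
SAMPLE = (
    "Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors\n"
    "Udp: 1000 2 7 500 7 0\n"
    "UdpLite: InDatagrams NoPorts\n"
    "UdpLite: 0 0\n"
)
counters = parse_udp_counters(SAMPLE)
```

The counter is cumulative and system-wide, so alert on its rate of change rather than its absolute value.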


For the clock synchronization that makes multicast timestamps meaningful, see PTP in Production with Solarflare. For the order book reconstruction you perform on top of multicast data, see Order Book Reconstruction at Scale. For the kernel bypass infrastructure that eliminates processing latency on the receive path, see Solarflare ef_vi, DPDK, and AF_XDP.
