Side-Channel Resistance and Constant-Time Crypto for Signing Services

When building ZeroCopy’s signing service, the constant-time requirement was non-negotiable. Not because we expected a local attacker with oscilloscopes - we are running in a Nitro Enclave, not embedded hardware. The requirement exists because timing side channels can be exploited remotely, over a network, with high precision, and because a signing service that leaks timing information about its key material will eventually be attacked.

This post covers what side-channel attacks are, which ones are relevant to a cloud-based signing service, and how to implement signing operations that resist them. The key mental model: if your code takes different amounts of time depending on the value of a secret, an attacker who can measure your response time can eventually recover the secret.

What Side-Channel Attacks Are

A side channel is any observable property of a computation that is not its intended output. The three most relevant for cryptographic signing services:

Timing side channels: the computation takes different time depending on secret values. An attacker who sends millions of queries and measures response times can recover information about the secret through statistical analysis.

Cache side channels: the computation accesses different memory locations depending on secret values. On a shared CPU, an attacker’s code can time its own memory accesses to learn which cache lines the victim recently touched, and from that infer the victim’s memory access pattern. Spectre and Meltdown are transient-execution attacks that use the cache as the channel for getting the leaked data out.

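Power/EM side channels: the computation consumes different amounts of power depending on secret values. Relevant mainly for hardware (embedded signing devices, HSMs), where an attacker can attach a probe. A remote attacker cannot measure a cloud service’s power consumption directly, although CPU frequency scaling can turn power differences into timing differences, which is why the checklist below pins the CPU frequency.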

For a cloud signing service, timing and cache side channels are the relevant threat models.

Timing Side Channels: The Core Problem

The fundamental rule: no computation path that depends on secret data may have variable timing.

This sounds simple. The devil is in what “depends on” means for modern CPUs.

// VULNERABLE: the branch is visible in timing
int ecdsa_sign_vulnerable(const uint8_t *key, const uint8_t *hash, uint8_t *sig) {
    uint8_t result[32];
    scalar_multiply(key, hash, result);  // Scalar mult on secret key

    if (result[0] == 0) {  // Secret-dependent branch!
        // Retry with different nonce
        return ecdsa_sign_vulnerable(key, hash, sig);  // Different timing path
    }

    memcpy(sig, result, 32);
    return 0;
}

The branch on result[0] == 0 is a function of the secret key and nonce. It takes a different code path depending on those secret values. An attacker who measures 10 million signing requests will observe that some take slightly longer (the retry path). The distribution of these timing observations leaks information about the secret key’s bit pattern.

But even without explicit branches, the following are dangerous:

// VULNERABLE: Early termination on key comparison (memcmp)
if (memcmp(expected_sig, received_sig, 32) == 0) {
    // memcmp stops at the first differing byte
    // Timing reveals how many leading bytes match
    return VALID;
}

// VULNERABLE: Array indexing by secret value
// The cache state after this reveals which index was accessed
const uint64_t TABLE[256] = { ... };
uint64_t val = TABLE[secret_byte];  // Cache timing reveals secret_byte

// VULNERABLE: Variable-time division
// CPU's division instruction has variable timing on many architectures
// Avoid when the operands are derived from secret data
uint64_t result = secret_value % modulus;
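
The usual fix for the secret-indexed table lookup above is to read every entry and select the wanted one with a mask, so the memory access pattern no longer depends on the secret. A minimal Rust sketch of that idea follows. The name ct_lookup is hypothetical, and the optimizer is still free to rewrite this code, so in a real service prefer a vetted primitive (such as the subtle crate mentioned in the checklist below) over a hand-rolled version.

```rust
/// Look up `table[secret_index]` while touching every entry, so the memory
/// access pattern is the same for every index.
/// Illustrative sketch only: `ct_lookup` is a hypothetical helper.
pub fn ct_lookup(table: &[u64; 256], secret_index: u8) -> u64 {
    let mut out = 0u64;
    for (i, &entry) in table.iter().enumerate() {
        // diff is zero only for the entry we want.
        let diff = (i as u64) ^ (secret_index as u64);
        // diff == 0 wraps to u64::MAX, whose top bit is 1.
        // diff in 1..=255 gives a small value, whose top bit is 0.
        let is_match = diff.wrapping_sub(1) >> 63;
        // Stretch the single bit into an all-ones or all-zeros mask.
        let mask = is_match.wrapping_neg();
        out |= entry & mask;
    }
    out
}
```

The cost is 256 reads instead of one, which is the general trade with constant-time code: it does the worst-case amount of work every time.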

Constant-Time Primitives

Constant-Time Comparison

// Rust: use the `subtle` crate for constant-time operations
use subtle::ConstantTimeEq;

fn verify_signature_constant_time(
    expected: &[u8; 32],
    received: &[u8; 32],
) -> bool {
    // This comparison takes the same time regardless of how many bytes match
    expected.ct_eq(received).into()
}

// In C, use sodium_memcmp or a manual implementation:
// int constant_time_memcmp(const void *a, const void *b, size_t n) {
//     const uint8_t *x = (const uint8_t *)a;
//     const uint8_t *y = (const uint8_t *)b;
//     uint8_t result = 0;
//     for (size_t i = 0; i < n; i++) {
//         result |= x[i] ^ y[i];  // XOR is 0 if equal; OR accumulates any difference
//     }
//     // result is 0 iff a == b, in constant time
//     return (int)result;
// }

Constant-Time Elliptic Curve Operations: The Montgomery Ladder

The core of ECDSA is scalar multiplication: compute k * G where k is a scalar (the private key or nonce) and G is the generator point. The naive implementation processes each bit of k and conditionally performs a point addition:

# VULNERABLE double-and-add:
Q = O  # point at infinity
for bit in bits(k):
    Q = 2*Q         # Always double
    if bit == 1:
        Q = Q + G   # Conditionally add - timing reveals key bits!

The Montgomery ladder processes every bit identically, whether it is 0 or 1:

# Montgomery ladder (same operations for every bit, but this sketch still branches):
R0 = O    # represents Q
R1 = G    # represents Q + G
for bit in bits(k):  # most significant bit first
    if bit == 0:
        R1 = R0 + R1   # Point add
        R0 = 2*R0      # Point double
    else:
        R0 = R0 + R1   # Point add (same operation!)
        R1 = 2*R1      # Point double (same operation!)
    # Both branches do the same operations, just to different registers
    # The branch itself is still visible - use conditional swap instead

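Production implementations remove the branch as well, typically by replacing it with a constant-time conditional swap of R0 and R1 around a fixed add-and-double sequence. That logic belongs in a vetted library rather than in application code.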
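
To make the conditional-swap idea concrete, here is a minimal sketch in Rust. It is not elliptic-curve code: cswap and ladder_pow are hypothetical names, and the ladder runs over modular exponentiation on u64 values (point addition becomes multiplication, doubling becomes squaring) purely to show the control flow. The % operator used for reduction is not constant-time, and nothing here stops the compiler from reintroducing a branch, so treat it as an illustration of the structure rather than something to sign with.

```rust
/// Swap `a` and `b` when `choice == 1`, leave them alone when `choice == 0`,
/// without branching on `choice`. Hypothetical helper for illustration.
pub fn cswap(choice: u64, a: &mut u64, b: &mut u64) {
    let mask = choice.wrapping_neg(); // 0 -> all zeros, 1 -> all ones
    let t = mask & (*a ^ *b);
    *a ^= t;
    *b ^= t;
}

/// Toy Montgomery ladder computing g^k mod m (m must be nonzero).
/// Every exponent bit performs exactly one multiply and one square.
/// NOTE: the `%` reduction below is NOT constant-time; only the control
/// flow is being demonstrated.
pub fn ladder_pow(g: u64, k: u64, m: u64) -> u64 {
    let mulmod = |x: u64, y: u64| (((x as u128) * (y as u128)) % (m as u128)) as u64;
    let mut r0 = 1 % m; // plays the role of Q
    let mut r1 = g % m; // plays the role of Q + G
    for i in (0..64).rev() {
        let bit = (k >> i) & 1;
        // If bit == 1, exchange the roles of r0 and r1 ...
        cswap(bit, &mut r0, &mut r1);
        // ... run the same fixed sequence for every bit ...
        r1 = mulmod(r0, r1); // "add"
        r0 = mulmod(r0, r0); // "double"
        // ... and swap back.
        cswap(bit, &mut r0, &mut r1);
    }
    r0
}
```

For example, ladder_pow(3, 5, 1000) walks all 64 bits of the exponent and returns 243, performing the same multiply-then-square sequence for the leading zero bits as for the set bits.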

// The Rust `secp256k1` crate wraps libsecp256k1, which already implements
// constant-time scalar multiplication.
// Always use a vetted library; never implement scalar multiplication yourself.

use secp256k1::{Secp256k1, SecretKey, Message};

fn sign_constant_time(
    key_bytes: &[u8; 32],
    message_hash: &[u8; 32],
) -> Result<[u8; 64], secp256k1::Error> {
    // The secp256k1 crate calls into libsecp256k1, whose signing path is designed to be constant-time
    let secp = Secp256k1::signing_only();
    let key = SecretKey::from_slice(key_bytes)?;
    let msg = Message::from_digest(*message_hash);

    let sig = secp.sign_ecdsa(&msg, &key);
    Ok(sig.serialize_compact())
}

The secp256k1 crate wraps the libsecp256k1 C library, which is designed so that operations on secret data run in constant time and which checks that property in its own test suite. This is the implementation Bitcoin Core uses.

Zeroizing Key Material After Use

Key material in memory should be zeroed as soon as it is no longer needed. Rust’s ownership model helps here, because Drop runs deterministically when the value goes out of scope:

use zeroize::Zeroize;

struct SigningKey {
    key_bytes: [u8; 32],
}

impl Drop for SigningKey {
    fn drop(&mut self) {
        // Zero key material when the struct is dropped
        self.key_bytes.zeroize();
    }
}

// Alternatively, replace the manual Drop impl above with zeroize's derive macro
// (requires the crate's `derive` feature):
use zeroize::ZeroizeOnDrop;

#[derive(ZeroizeOnDrop)]
struct SigningKey {
    key_bytes: [u8; 32],
}

The zeroize crate uses compiler barriers (std::ptr::write_volatile + fences) to prevent the compiler from optimizing away the zeroing as “dead code.”
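
To show what that volatile-write-plus-fence pattern looks like, here is a std-only sketch. It is not the zeroize crate’s source; volatile_zero and KeyBytes are hypothetical names, and in production the crate itself is the better choice. Note also that zeroing on drop only clears this one location: copies the compiler made when the value was moved, or copies held by the caller, are not touched.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Overwrite `buf` with zeros using volatile writes, which the compiler is
/// not allowed to drop as dead stores. Hypothetical helper for illustration.
pub fn volatile_zero(buf: &mut [u8]) {
    for byte in buf.iter_mut() {
        // SAFETY: `byte` is a valid, aligned, exclusive reference into `buf`.
        unsafe { ptr::write_volatile(byte, 0) };
    }
    // Keep the compiler from reordering later memory accesses before the wipe.
    compiler_fence(Ordering::SeqCst);
}

/// Hypothetical wrapper that wipes its contents when dropped.
pub struct KeyBytes {
    bytes: [u8; 32],
}

impl KeyBytes {
    /// Takes the array by value; the caller's own copy is NOT wiped.
    pub fn new(bytes: [u8; 32]) -> Self {
        KeyBytes { bytes }
    }

    pub fn as_bytes(&self) -> &[u8; 32] {
        &self.bytes
    }
}

impl Drop for KeyBytes {
    fn drop(&mut self) {
        volatile_zero(&mut self.bytes);
    }
}
```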

Spectre and Cache-Timing in Signing Services

Spectre (CVE-2017-5753 and CVE-2017-5715) is a family of speculative execution attacks that lets an attacker’s code read memory it should not have access to, by exploiting the CPU’s branch prediction and using the cache as a covert channel.

For a cloud-based signing service, the relevant threat model is:

Co-tenancy: your signing service shares a physical CPU with other tenants’ code
Spectre: a co-tenant’s malicious code trains the branch predictor in a way that causes your signing code to speculatively execute a path that loads key material into cache
Cache exfiltration: the attacker measures cache access timing to infer what memory was loaded

Mitigations deployed in production signing services:

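// Retpoline: indirect calls and jumps go through a return-based trampoline,
// which mitigates branch target injection (Spectre v2). This is a compiler-level
// mitigation; LLVM implements it for x86, but how rustc exposes it depends on the
// toolchain version, so check the flags your toolchain documents.
// Note: speculative load hardening is an LLVM function attribute, not a target
// feature, so `-C target-feature=+speculative-load-hardening` does not enable it.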

// Serialize: use LFENCE after sensitive branches to prevent speculative execution
// (Not directly expressible in safe Rust; requires unsafe or assembly)

// Process isolation: most important mitigation
// Run the signing service in a dedicated process or enclave (Nitro Enclave does this)
// Spectre within an enclave boundary requires significantly more attacker capability

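For ZeroCopy’s signing service, the primary Spectre mitigation is the Nitro Enclave itself. The enclave is a separate virtual machine, so the isolation is at the VM level rather than the process level: its vCPUs and memory are carved out of the parent instance and isolated by the Nitro Hypervisor. The enclave still runs on the host’s main CPUs (the Nitro Cards offload I/O, they do not execute enclave code), so the protection comes from dedicated cores and hypervisor-enforced memory isolation rather than from a physically separate processor. A cross-VM Spectre attack would need attacker code on the same physical host that shares microarchitectural state, such as cache, with the enclave’s cores.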

Practical Implementation Checklist

For any signing service handling real funds:

Control                                     Implementation
──────────────────────────────────────────────────────────────────────────────────────────────
Constant-time key comparison                subtle::ConstantTimeEq (Rust), sodium_memcmp (C)
Constant-time ECDSA                         libsecp256k1 (audited), never custom scalar mult
Constant-time Ed25519                       libsodium's crypto_sign (audited)
Key material zeroed after use               zeroize::ZeroizeOnDrop (Rust), sodium_memzero (C)
No secret-dependent array indexing          Use constant-time table lookups (subtle crate)
Dedicated process (no co-tenancy)           Separate OS process, preferably TEE
Spectre mitigation                          Retpoline (compiler), process isolation
Hardware timing (avoid CPU frequency scale) Lock CPU to fixed frequency (performance governor)
Network timing (equalize response time)     Add fixed padding to equalize response times

Network-level timing normalization is underappreciated. Even with perfectly constant-time crypto, your network response time varies because of application-level processing (policy checks, logging, queue depth). An attacker measuring network response time can observe these variations. For a signing service where timing leakage is a real concern:

use tokio::time::{sleep, Duration, Instant};

async fn sign_with_timing_normalization(
    request: SignRequest,
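    min_response_time_us: u64,  // e.g., 2_000 (2 ms); tokio's timer has millisecond granularity, so a sub-millisecond floor is not enforced precisely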
) -> SignResponse {
    let start = Instant::now();

    let response = perform_signing(request).await;

    // Pad response time to a minimum floor
    let elapsed = start.elapsed();
    let target = Duration::from_micros(min_response_time_us);
    if elapsed < target {
        sleep(target - elapsed).await;
    }

    response
}

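This normalization does not fully prevent timing attacks: any response that takes longer than the floor is returned unpadded, so an attacker can still learn from the slow tail (for example, responses above P99.9 if the floor is set there). It does eliminate the easily observable systematic variation below the floor.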
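
One way to narrow that remaining gap is to pad to the next multiple of a fixed bucket instead of to a single floor, so slow responses also land on a boundary. The sketch below shows only the arithmetic; padding_to_bucket is a hypothetical helper, and the bucket size is an assumption you would tune against your own latency distribution. The caller would sleep for the returned duration before responding, on success and error paths alike.

```rust
use std::time::Duration;

/// How long to wait so that the total response time is the next whole
/// multiple of `bucket` (always at least one bucket).
/// Hypothetical helper for illustration.
pub fn padding_to_bucket(elapsed: Duration, bucket: Duration) -> Duration {
    // Guard against a zero-length bucket.
    let bucket_ns = bucket.as_nanos().max(1);
    let elapsed_ns = elapsed.as_nanos();
    // Round up to a whole number of buckets, with a minimum of one.
    let buckets = ((elapsed_ns + bucket_ns - 1) / bucket_ns).max(1);
    let target_ns = buckets * bucket_ns;
    // The difference is always smaller than or equal to one bucket.
    Duration::from_nanos((target_ns - elapsed_ns) as u64)
}
```

With a 1 ms bucket, a request that took 300µs is padded by 700µs and one that took 1.2 ms is padded by 800µs, so an observer sees only 1 ms or 2 ms. The bucket count itself still leaks coarse information, which is the trade against adding latency to every request.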

How This Breaks in Production

Failure 1: Compiler optimizing away security-critical constant-time code. You implement a constant-time comparison manually. The optimizer only has to preserve the function’s result, not its timing, so it is free to rewrite your loop into something with an early exit or a call to memcmp. Your “constant-time” comparison is now variable-time. Fix: use library functions explicitly designed for this purpose (subtle, libsodium, libsecp256k1) that use compiler barriers and volatile accesses to prevent that optimization. Never implement constant-time primitives yourself.

Failure 2: Key material in core dump. A signing service crashes under load. The OS writes a core dump containing all process memory. The key bytes are in the core dump, unprotected, on a shared filesystem. Fix: disable core dumps for signing processes (ulimit -c 0 in the process environment, or prctl(PR_SET_DUMPABLE, 0) in the process code). Use madvise(MADV_DONTDUMP) on memory regions containing key material.

Failure 3: Timing side channel via response code. The signing service returns an error immediately when the policy check fails, and takes 42µs on success. An attacker who sends malformed requests can distinguish “policy rejected” from “signing error” from “success” purely by timing. Each reveals information. Fix: normalize response time across all response codes, not just success. The timing floor should apply even to error responses.

Failure 4: Constant-time properties verified only in debug builds. Your test suite builds without optimizations (-O0), while production builds with -O2 or --release. The developer tests the signing service in debug mode and checks correctness, not timing, so nobody notices that the two builds have different timing characteristics: the optimizer in the release build can introduce branches or early exits that the debug build never had. Fix: test constant-time properties under the same compiler flags as production. Automated timing tests should run in release mode.

Failure 5: Third-party library not audited for constant-time. You use a Python library for ECDSA signing because the rest of your service is Python. Python’s cryptography library uses OpenSSL’s ECDSA under the hood - which is constant-time for the scalar multiplication but the Python binding may introduce timing variations through object allocation, GC pauses, and dictionary lookups on error paths. Fix: for constant-time guarantees, signing operations should be in compiled native code (Rust, C via cffi) with direct control over the execution path. Python is not appropriate for the signing hot path in a latency-sensitive, security-critical service.

Failure 6: Signing service co-located with general computation. The signing service runs on the same physical host as a web server, a cache, and a job queue. All of these processes share the CPU’s cache. Cache side-channel attacks (Spectre, Flush+Reload) are feasible from any co-located process. Fix: signing services should run in dedicated enclaves (Nitro) or on dedicated physical machines. The value of the assets being protected justifies the infrastructure cost of dedicated compute.
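
For the automated timing tests called for in Failure 4, a common approach is to time the operation on two classes of input (for example, a fixed key versus random keys) and compare the two sets of measurements with Welch’s t-test. The sketch below covers only the statistic. The names welch_t and looks_leaky are hypothetical, the 4.5 threshold is a conventional choice in leakage testing rather than a guarantee, and collecting clean timing samples (release build, pinned CPU, many repetitions) is the harder part that this leaves out.

```rust
/// Welch's t-statistic for two sets of timing samples (e.g. nanoseconds).
/// Each slice needs at least two samples. Hypothetical helper for illustration.
pub fn welch_t(a: &[f64], b: &[f64]) -> f64 {
    let mean = |x: &[f64]| x.iter().sum::<f64>() / x.len() as f64;
    // Unbiased sample variance.
    let var = |x: &[f64], m: f64| {
        x.iter().map(|v| (v - m).powi(2)).sum::<f64>() / (x.len() as f64 - 1.0)
    };
    let (mean_a, mean_b) = (mean(a), mean(b));
    let (var_a, var_b) = (var(a, mean_a), var(b, mean_b));
    (mean_a - mean_b) / (var_a / a.len() as f64 + var_b / b.len() as f64).sqrt()
}

/// Flag a timing difference between the two input classes.
/// The 4.5 threshold is an assumption; tune it for your test setup.
pub fn looks_leaky(t: f64) -> bool {
    t.abs() > 4.5
}
```

A test that fails the build when looks_leaky returns true will not prove the code is constant-time, but it does catch the regression where a compiler or dependency upgrade quietly changes the release build’s timing behavior.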

Related reading: AWS Nitro Enclaves for Wallet Signing covers the enclave architecture that provides hardware isolation for the signing service. Threshold Signature Schemes covers MPC protocols that must also be implemented with constant-time operations.