5 HFT Secrets Every Quant Trader Should Know
From order flow to lock-free buffers — the building blocks of high-frequency trading
Whether you're prepping for a quant trading interview or sharpening your market microstructure intuition, these five drills cover the core building blocks of high-frequency trading: reading order flow, detecting toxic liquidity, estimating fair value, quoting optimally, and building the infrastructure that makes it all run in nanoseconds.
Each drill includes the problem statement, the key insight, and a working Python implementation.
1. Order Flow Imbalance (OFI)
The drill: Compute OFI from L1 order book snapshots (best bid/ask price & size).
Why it matters: OFI, introduced by Cont, Kukanov & Stoikov (2014), captures supply/demand shifts at the top of book. When the bid queue grows or the ask queue shrinks, buying pressure increases. A simple linear regression of contemporaneous price changes on OFI (both measured over the same interval) explains roughly 40-65% of short-term price variance, which makes it one of the most powerful microstructure signals in production systems.
The key insight is tracking changes in queue sizes, conditioned on whether the price level itself moved.
OFI(t) = ΔBid(t) − ΔAsk(t)
where ΔBid = Qbid · 1{Pbid ↑} + (Qbid − Qbid,prev) · 1{Pbid =} − Qbid,prev · 1{Pbid ↓}
and ΔAsk = Qask · 1{Pask ↓} + (Qask − Qask,prev) · 1{Pask =} − Qask,prev · 1{Pask ↑}
ΔP = β · OFI + ε (R² ≈ 0.40–0.65 at 10s horizons)
import pandas as pd

def compute_ofi(ob: pd.DataFrame) -> pd.Series:
    """ob must have columns: bid_price, bid_size, ask_price, ask_size.

    The first row has no previous snapshot, so its OFI is NaN.
    """
    bp, bs = ob['bid_price'], ob['bid_size']
    ap, asize = ob['ask_price'], ob['ask_size']

    # Bid side contribution
    bid_up = (bp > bp.shift(1)).astype(int)
    bid_same = (bp == bp.shift(1)).astype(int)
    bid_dn = (bp < bp.shift(1)).astype(int)
    ofi_bid = bid_up * bs + bid_same * (bs - bs.shift(1)) - bid_dn * bs.shift(1)

    # Ask side contribution
    ask_dn = (ap < ap.shift(1)).astype(int)
    ask_same = (ap == ap.shift(1)).astype(int)
    ask_up = (ap > ap.shift(1)).astype(int)
    ofi_ask = ask_dn * asize + ask_same * (asize - asize.shift(1)) - ask_up * asize.shift(1)

    return ofi_bid - ofi_ask  # positive = net buying pressure

O(n) time · O(1) space per tick
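The three price cases (level improved, unchanged, worsened) can be checked by hand on a toy book. The sketch below is only an illustration: the helper name `ofi_from_snapshots`, the four snapshots, and the no-intercept least-squares fit are assumptions made for this example, not data or code from the paper. It restates the same formula with `np.where` so it runs on its own.

```python
import numpy as np
import pandas as pd

def ofi_from_snapshots(ob: pd.DataFrame) -> pd.Series:
    """Hypothetical standalone helper: same OFI formula, written with np.where."""
    bp, bs = ob["bid_price"].to_numpy(), ob["bid_size"].to_numpy()
    ap, az = ob["ask_price"].to_numpy(), ob["ask_size"].to_numpy()
    # Compare each snapshot (rows 1..n-1) with the one before it
    d_bid = np.where(bp[1:] > bp[:-1], bs[1:],
             np.where(bp[1:] == bp[:-1], bs[1:] - bs[:-1], -bs[:-1]))
    d_ask = np.where(ap[1:] < ap[:-1], az[1:],
             np.where(ap[1:] == ap[:-1], az[1:] - az[:-1], -az[:-1]))
    return pd.Series(d_bid - d_ask, index=ob.index[1:], name="ofi")

# Toy L1 snapshots (made-up numbers)
book = pd.DataFrame({
    "bid_price": [100.00, 100.00, 100.01, 100.00],
    "bid_size":  [500,    700,    200,    400],
    "ask_price": [100.02, 100.02, 100.02, 100.01],
    "ask_size":  [600,    600,    300,    250],
})

ofi = ofi_from_snapshots(book)
# Row 1: bid unchanged (+200), ask unchanged (0)        -> OFI = 200
# Row 2: bid improved (+200), ask queue shrank (-300)   -> OFI = 500
# Row 3: bid dropped (-200), ask moved down (+250)      -> OFI = -450

# Fit dP = beta * OFI by least squares (no intercept, an assumption here)
mid = (book["bid_price"] + book["ask_price"]) / 2
d_mid = mid.diff().iloc[1:].to_numpy()
x = ofi.to_numpy().astype(float)
beta = float(x @ d_mid) / float(x @ x)
print(ofi.tolist(), beta)
```

Three points are far too few for a meaningful fit; with real data the same regression would run over thousands of fixed-length intervals, with OFI summed within each interval.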
2. VPIN — Volume-Synchronized Probability of Informed Trading
The drill: Implement VPIN using Bulk Volume Classification on trade data.
Why it matters: VPIN (Easley, Lopez de Prado & O'Hara, 2012) answers a critical question: is the flow hitting my quotes informed or noise? It groups trades into equal-volume buckets (volume time instead of clock time) and measures how one-sided each bucket is. Its authors reported that VPIN rose sharply in the hours before the 2010 Flash Crash, and it is now a widely used toxicity metric for market makers deciding whether to widen spreads or pull quotes entirely.
VPIN = (1/n) · ∑τ=1..n |Vbuy(τ) − Vsell(τ)| / Vbucket
Vbuy = V · Φ(ΔP / σ), Vsell = V · (1 − Φ(ΔP / σ))
where Φ is the standard normal CDF (Bulk Volume Classification)
import numpy as np
from scipy.stats import norm

def compute_vpin(prices, volumes, bucket_size, n_buckets=50):
    """
    prices, volumes: arrays of trade prices and quantities.
    bucket_size: volume per bucket (e.g. daily_vol / 50).
    Trades are assigned whole to one bucket, so bucket volumes are only
    approximately equal and the last bucket may be partial.
    """
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(volumes, dtype=float)

    # BVC: classify each trade's volume as buy or sell
    dp = np.diff(prices, prepend=prices[0])
    moves = dp[dp != 0]
    sigma = np.std(moves) if moves.size else 0.0
    if not sigma > 0:  # no price moves, or all moves identical
        sigma = 1e-10
    buy_pct = norm.cdf(dp / sigma)
    buy_vol = volumes * buy_pct
    sell_vol = volumes * (1 - buy_pct)

    # Accumulate into volume buckets; bincount sums each bucket in one pass
    cum_vol = np.cumsum(volumes)
    bucket_ids = (cum_vol // bucket_size).astype(int)
    imb = np.abs(np.bincount(bucket_ids, weights=buy_vol - sell_vol))

    # Rolling VPIN over n_buckets
    if len(imb) < n_buckets:
        return np.array([imb.sum() / (bucket_size * len(imb))])
    vpin = np.convolve(imb, np.ones(n_buckets), 'valid') / (n_buckets * bucket_size)
    return vpin

# VPIN near 0 = balanced flow; near 1 = highly toxic

O(n) time · O(B) space where B = num buckets
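The implementation above assigns each trade whole to a single bucket, so buckets only approximately hold `bucket_size` of volume. One possible refinement, sketched here under assumptions, splits a trade pro rata when it straddles a bucket boundary so that every closed bucket holds exactly `bucket_size`. The helper name `exact_bucket_imbalance` and the sample volumes are hypothetical; the inputs are buy and sell volumes that have already been classified (for example by BVC).

```python
import numpy as np

def exact_bucket_imbalance(buy_vol, sell_vol, bucket_size):
    """Hypothetical helper: fill buckets to exactly bucket_size, splitting
    trades pro rata at bucket boundaries. Returns |buy - sell| per full bucket."""
    imbalances = []
    b_acc = s_acc = 0.0                 # buy / sell volume in the open bucket
    for b, s in zip(buy_vol, sell_vol):
        remaining = b + s               # part of this trade not yet assigned
        while remaining > 0:
            room = bucket_size - (b_acc + s_acc)
            take = min(room, remaining)
            frac = take / (b + s)       # share of the trade going into this bucket
            b_acc += b * frac
            s_acc += s * frac
            remaining -= take
            if b_acc + s_acc >= bucket_size - 1e-9:   # bucket full: close it
                imbalances.append(abs(b_acc - s_acc))
                b_acc = s_acc = 0.0
    return np.array(imbalances)         # the trailing partial bucket is dropped

# Four trades of 100 units each, already split into buy/sell volume
buy = [60, 10, 90, 40]
sell = [40, 90, 10, 60]
imb = exact_bucket_imbalance(buy, sell, bucket_size=150)
vpin = imb.mean() / 150
print(imb, vpin)   # two full buckets with imbalances 20 and 40, so VPIN = 0.2
```

The second trade is split in half: 50 units complete the first bucket and 50 units open the second. The last 100 units never fill a bucket and are left out, which avoids normalizing a partial bucket by the full `bucket_size`.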
3. Microprice
The drill: Compute the microprice from L1 order book data.
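Why it matters: The midprice treats bid and ask equally, but that's wrong when queue sizes differ. The formula below weights each side's price by the opposite side's depth, so it incorporates queue imbalance: if bid size is much larger than ask size, the price sits above the mid, correctly signalling upward pressure. Strictly speaking this is the imbalance-weighted mid, which practitioners often call the microprice. Stoikov (2018) defines the micro-price as the limit of expected future midprices given the current spread and imbalance; that estimator is a martingale by construction, and the paper reports that it predicts short-term prices better than either the mid or the weighted mid. The weighted mid remains the simplest meaningful upgrade over the midprice.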
Pmicro = Pask · Qbid / (Qbid + Qask) + Pbid · Qask / (Qbid + Qask)
Pmicro − Pmid = spread · (imbalance − 0.5)
where imbalance = Qbid / (Qbid + Qask)
import pandas as pd

def microprice(bid_price, ask_price, bid_size, ask_size):
    """Vectorized microprice computation."""
    total = bid_size + ask_size
    return (bid_size * ask_price + ask_size * bid_price) / total

def microprice_signal(ob: pd.DataFrame, lag: int = 1) -> pd.Series:
    """Microprice return as a signal."""
    mp = microprice(ob['bid_price'], ob['ask_price'],
                    ob['bid_size'], ob['ask_size'])
    return mp.pct_change(lag)

# micro vs mid: micro adjusts for imbalance
# If bid_size >> ask_size, micro > mid (predicts price going up)
# Difference: micro - mid = spread * (imbalance - 0.5)

O(1) per tick · O(n) vectorized
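A quick numeric check of the identity micro − mid = spread · (imbalance − 0.5) may help build intuition. The quote and size values below are made up for illustration, and `weighted_mid` is a hypothetical standalone copy of the formula so the snippet runs on its own.

```python
import math

def weighted_mid(bid_price, ask_price, bid_size, ask_size):
    """Each side's price weighted by the opposite side's depth."""
    return (bid_size * ask_price + ask_size * bid_price) / (bid_size + ask_size)

# Made-up top of book: a heavy bid queue against a thin ask queue
bid, ask = 100.00, 100.02
bid_size, ask_size = 900, 100

micro = weighted_mid(bid, ask, bid_size, ask_size)
mid = (bid + ask) / 2
spread = ask - bid
imbalance = bid_size / (bid_size + ask_size)   # 0.9

print(f"mid={mid:.3f} micro={micro:.3f} diff={micro - mid:.4f}")
# Both sides of the identity agree up to floating-point error
print(math.isclose(micro - mid, spread * (imbalance - 0.5), abs_tol=1e-9))
```

With 90% of the displayed size on the bid, the weighted price sits 0.8 cents above the mid, that is 40% of the 2-cent spread. When the two queues are equal the imbalance is 0.5 and the two prices coincide.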
4. Avellaneda-Stoikov Optimal Quotes
The drill: Compute optimal bid/ask quotes using the Avellaneda-Stoikov model.
Why it matters: This is the foundational market making model. Avellaneda & Stoikov (2008) solved the problem of where to place your bid and ask quotes given your current inventory, the asset's volatility, and how much time remains in the trading session. The key idea: your reservation price (the price at which you are indifferent between buying and selling one more unit) shifts away from mid in proportion to your inventory. When you're long, both quotes shift down: the lower ask makes it more likely that you sell, and the lower bid makes it less likely that you buy more. The optimal spread depends on volatility, risk aversion, and how quickly the order arrival intensity decays as quotes move away from the mid.
r(t) = S(t) − q · γ · σ² · (T − t)
δ = γ · σ² · (T − t) + (2/γ) · ln(1 + γ/k)
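bid = r − δ/2, ask = r + δ/2
where q = inventory, γ = risk aversion, σ = volatility, δ = total bid-ask spread, and k = decay rate of the order arrival intensity λ(d) = A · exp(−k · d) for a quote placed at distance d from the mid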
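import numpy as np

def avellaneda_stoikov(mid, inventory, sigma, gamma, k, T_rem):
    """Optimal quotes; T_rem is the time remaining in the session (T - t)."""
    # Reservation price: mid shifted against the current inventory
    reservation = mid - inventory * gamma * sigma**2 * T_rem
    # Total bid-ask spread, split evenly around the reservation price
    spread = gamma * sigma**2 * T_rem + (2/gamma) * np.log(1 + gamma/k)
    bid = reservation - spread / 2
    ask = reservation + spread / 2
    return {'bid': bid, 'ask': ask, 'reservation': reservation,
            'spread': ask - bid, 'skew': mid - reservation}

O(1) time · O(1) space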
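To see the inventory skew in numbers, the sketch below evaluates the same two formulas for a short, flat, and long position. The parameter values (σ = 2, γ = 0.1, k = 1.5, one unit of time remaining) and the helper name `as_quotes` are illustrative assumptions, not calibrated inputs.

```python
import numpy as np

def as_quotes(mid, inventory, sigma, gamma, k, t_rem):
    """Hypothetical standalone helper returning (bid, ask, reservation)."""
    reservation = mid - inventory * gamma * sigma**2 * t_rem
    spread = gamma * sigma**2 * t_rem + (2 / gamma) * np.log(1 + gamma / k)
    return reservation - spread / 2, reservation + spread / 2, reservation

results = {}
for q in (-5, 0, 5):
    bid, ask, r = as_quotes(mid=100.0, inventory=q, sigma=2.0,
                            gamma=0.1, k=1.5, t_rem=1.0)
    results[q] = (bid, ask, r)
    print(f"q={q:+d}  reservation={r:.2f}  bid={bid:.3f}  ask={ask:.3f}  "
          f"spread={ask - bid:.3f}")
```

The spread is identical in all three cases because it does not depend on inventory; only the center of the quotes moves. With these inputs each unit of inventory shifts the reservation price by γ · σ² · (T − t) = 0.4, so a long position of 5 moves both quotes down by 2.0 and a short position of 5 moves them up by the same amount.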
5. SPSC Lock-Free Ring Buffer
The drill: Implement a single-producer single-consumer lock-free ring buffer.
Why it matters: Every HFT system needs to pass market data from a feed handler to a strategy engine with minimal latency. The SPSC ring buffer is the fundamental IPC primitive that makes this possible: zero locks, zero allocations, deterministic latency. The producer writes at the tail, the consumer reads at the head, and neither ever blocks on the other; a full or empty buffer is reported immediately instead of waiting. In C++, a well-tuned implementation runs at roughly 2-5 nanoseconds per operation, which is on the order of 200-500 million operations per second. The critical optimization is cache-line padding between head and tail to prevent false sharing.
class SPSCRingBuffer:
    """Lock-free SPSC ring buffer (Python simulation of C++ pattern).

    One slot is always left empty to tell a full buffer from an empty one,
    so the buffer holds at most capacity - 1 items.
    """
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.buffer = [None] * capacity
        self.head = 0  # consumer reads here
        self.tail = 0  # producer writes here
        # In C++: head/tail would be std::atomic<size_t>
        # with cache line padding (64 bytes) between them

    def push(self, item) -> bool:
        """Producer only. Returns False if full."""
        next_tail = (self.tail + 1) % self.capacity
        # In C++: head.load(std::memory_order_acquire)
        if next_tail == self.head:  # full
            return False
        self.buffer[self.tail] = item
        # In C++: tail.store(next_tail, std::memory_order_release)
        self.tail = next_tail
        return True

    def pop(self):
        """Consumer only. Returns None if empty."""
        # In C++: tail.load(std::memory_order_acquire)
        if self.head == self.tail:  # empty
            return None
        item = self.buffer[self.head]
        # In C++: head.store(next_head, std::memory_order_release)
        self.head = (self.head + 1) % self.capacity
        return item

# C++ version: ~2-5ns per push/pop (roughly 200-500M ops/sec)
# Key: cache line padding between head and tail (64 bytes)
# struct alignas(64) { std::atomic<size_t> head; };
# struct alignas(64) { std::atomic<size_t> tail; };

O(1) push/pop · zero locks · zero allocations
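To see the producer/consumer split in action, the sketch below runs one producer thread and one consumer thread against a small buffer and checks that items arrive in order. It uses a common variant of the pattern, assumed here for illustration: a power-of-two capacity with free-running counters and a bitmask, which lets every slot be used. The class name `MaskedSPSC` and the demo sizes are hypothetical. In Python, correctness relies on the interpreter's global lock rather than on memory-ordering guarantees, so this shows the logic only, not the latency.

```python
import threading
import time

class MaskedSPSC:
    """Hypothetical variant: power-of-two capacity, free-running counters."""
    def __init__(self, capacity: int = 1024):
        if capacity < 1 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self._mask = capacity - 1
        self._buf = [None] * capacity
        self._head = 0   # total items popped (written by the consumer only)
        self._tail = 0   # total items pushed (written by the producer only)

    def push(self, item) -> bool:
        if self._tail - self._head == len(self._buf):   # full
            return False
        self._buf[self._tail & self._mask] = item
        self._tail += 1      # publish only after the slot is written
        return True

    def pop(self):
        if self._head == self._tail:                    # empty
            return None
        item = self._buf[self._head & self._mask]
        self._head += 1      # free the slot only after it is read
        return item

def run_demo(n_items: int = 10_000, capacity: int = 64):
    q = MaskedSPSC(capacity)
    received = []

    def producer():
        for i in range(n_items):
            while not q.push(i):
                time.sleep(0)          # buffer full: yield and retry

    def consumer():
        while len(received) < n_items:
            item = q.pop()
            if item is None:
                time.sleep(0)          # buffer empty: yield and retry
            else:
                received.append(item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received

received = run_demo()
print(len(received), received[:5])
```

Each index has exactly one writer, which is what removes the need for locks: the producer only advances the tail and the consumer only advances the head. A C++ port would make both counters `std::atomic<size_t>` with acquire loads and release stores, and would keep them on separate cache lines.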
These five drills span the full HFT stack — from reading the order book (OFI, Microprice), to detecting when to pull quotes (VPIN), to setting optimal prices (Avellaneda-Stoikov), to building the infrastructure that delivers data in nanoseconds (SPSC buffer). Master these and you'll have a solid foundation for any quant trading interview or system design discussion.
This post is part of the Quant Trading Drill Series — 150 hands-on coding exercises covering microstructure, statistical testing, market making, ML for alpha, and HFT systems.

