The data speaks. We translate.
Every claim is grounded in walk-forward validated data, an ECE-first methodology (ECE: expected calibration error), and reproducible analysis. No hand-waving. No overfitting. No storytelling without evidence.
What the data reveals
The inverse-S calibration curve
Prediction markets systematically overprice events at extreme probabilities and underprice events near 50/50. This is consistent with prospect theory's probability weighting function — a structural bias, not random noise.
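A calibration curve of this kind can be checked by binning market prices and comparing each bin's average price with how often the event actually resolved "yes". The sketch below is a minimal illustration under assumed inputs: `prices` and `outcomes` are hypothetical lists of quoted probabilities and 0/1 resolutions, and equal-width bins are a choice made here, not necessarily the one used in the research.

```python
def reliability_curve(prices, outcomes, n_bins=10):
    """Return (mean_price, realized_frequency, count) for each non-empty bin.

    prices   -- quoted probabilities in [0, 1] (hypothetical input)
    outcomes -- 0/1 resolutions aligned with prices (hypothetical input)
    """
    # Per-bin accumulators: [sum of prices, sum of outcomes, count].
    acc = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(prices, outcomes):
        i = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        acc[i][0] += p
        acc[i][1] += y
        acc[i][2] += 1
    return [(sp / n, sy / n, n) for sp, sy, n in acc if n > 0]
```

A perfectly calibrated market has mean price equal to realized frequency in every bin; systematic gaps between the two trace out the shape of the bias.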
Divergences widen rather than converge
Gaps between prediction market (PM) prices and options-implied probabilities widen as expiry approaches, contrary to what the efficient market hypothesis would predict. The gap isn't noise to filter; it's signal to exploit. Different market microstructures produce systematically different prices.
The hump-shape adverse selection pattern
The informed/uninformed trader gap peaks at intermediate conviction levels — a clean empirical isolation of the Glosten-Milgrom mechanism. Fill-or-kill orders are adversely selected when the signal is ambiguous, not when it's extreme.
Ensemble outperforms any single source
A stacking ensemble combining PM prices, options-implied probabilities, and microstructure features achieves 16% better calibration (ECE) than any individual market. The whole is genuinely greater than the sum of its parts.
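For readers unfamiliar with the metric, expected calibration error (ECE) is the count-weighted average gap between predicted probability and realized frequency across probability bins. The following is a minimal sketch using equal-width bins; the function name, the bin count, and the inputs are assumptions for illustration.

```python
def expected_calibration_error(probs, outcomes, n_bins=10):
    """Count-weighted mean of |mean predicted prob - realized frequency| per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        frequency = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_prob - frequency)
    return ece
```

Lower is better. A relative improvement between two estimators can then be expressed as `1 - ece_model / ece_baseline`, though the exact convention behind the figure quoted above is not specified here.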
The Consensus Probability Framework
Five components: four probability sources, each drawing on a different information set, and a meta-learner that fuses them into a single calibrated estimate.
PM Mid-Price
The prediction market order book midpoint — the most liquid and fastest-updating probability estimate. Captures crowd consensus but carries the inverse-S calibration bias documented in our research.
Call Spread Implied Probability
Derived from options markets using call spread pricing. Reflects the institutional view on the same event, priced through a fundamentally different mechanism with different participants and different regulatory constraints.
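As a rough illustration of the idea, a tight call spread (long a call at a lower strike, short a call at a slightly higher strike) pays out approximately like a digital option, so its cost divided by the strike width approximates the risk-neutral probability of finishing above a strike between the two. The sketch below assumes hypothetical inputs (`call_low`, `call_high`, the two strikes, a flat rate); it is not the pricing code behind this source.

```python
import math

def call_spread_probability(call_low, call_high, k_low, k_high,
                            rate=0.0, t_years=0.0):
    """Approximate risk-neutral P(S_T > K) for a strike K between k_low and k_high.

    call_low / call_high -- call prices at strikes k_low < k_high (hypothetical)
    rate, t_years        -- flat interest rate and time to expiry, used to undo discounting
    """
    if k_high <= k_low:
        raise ValueError("k_high must be greater than k_low")
    spread_cost = call_low - call_high
    prob = math.exp(rate * t_years) * spread_cost / (k_high - k_low)
    # Noisy quotes can push the estimate slightly outside [0, 1]; clamp it.
    return min(1.0, max(0.0, prob))
```

For example, calls priced at 6.0 and 4.0 at strikes 100 and 105 imply a probability of about 0.4 with zero rates. This is a risk-neutral figure, so it embeds risk premia and is not directly a real-world frequency.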
Breeden-Litzenberger Density
Extracts the full risk-neutral probability distribution from the options smile. Provides not just a point estimate but the market's entire belief about the outcome distribution — revealing tail risk and skew that point estimates miss.
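The Breeden-Litzenberger result states that the risk-neutral density equals the discount-adjusted second derivative of the call price with respect to strike, f(K) = e^{rT} ∂²C/∂K². A minimal finite-difference sketch follows, assuming call prices on an evenly spaced strike grid; the function name and inputs are illustrative, and real option chains usually need smoothing or interpolation first.

```python
import math

def breeden_litzenberger_density(strikes, call_prices, rate=0.0, t_years=0.0):
    """Estimate the risk-neutral density at each interior strike.

    Assumes strikes are sorted and evenly spaced (an assumption of this sketch).
    Returns a list of (strike, density) pairs.
    """
    if len(strikes) < 3 or len(strikes) != len(call_prices):
        raise ValueError("need at least three strikes with matching prices")
    step = strikes[1] - strikes[0]
    growth = math.exp(rate * t_years)  # undoes the discounting in call prices
    density = []
    for i in range(1, len(strikes) - 1):
        # Central second difference approximates d^2 C / dK^2.
        second_diff = (call_prices[i - 1] - 2 * call_prices[i]
                       + call_prices[i + 1]) / step ** 2
        density.append((strikes[i], growth * second_diff))
    return density
```

Summing `density * step` over the grid should come out close to 1 when the strikes cover most of the distribution, which is a useful sanity check on the input quotes.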
Historical Lookup Table
Base rates computed from thousands of resolved markets. When BTC is at price X with volatility Y and the PM mid is Z, what actually happened historically? This grounds the model in empirical frequency, not just current market opinion.
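One simple way to build such a table is to bucket each resolved market by its state variables and record the fraction that resolved "yes" in each bucket. The sketch below assumes a hypothetical record layout `(price, volatility, pm_mid, outcome)` and hand-picked bucket edges; neither is taken from the actual dataset.

```python
from bisect import bisect_right
from collections import defaultdict

def build_lookup(records, price_edges, vol_edges, mid_edges, min_count=1):
    """Map (price_bucket, vol_bucket, mid_bucket) -> empirical 'yes' frequency.

    records -- iterable of (price, volatility, pm_mid, outcome) with outcome in {0, 1}
    *_edges -- sorted bucket boundaries for each variable (hypothetical choices)
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [yes count, total count]
    for price, vol, mid, outcome in records:
        key = (bisect_right(price_edges, price),
               bisect_right(vol_edges, vol),
               bisect_right(mid_edges, mid))
        counts[key][0] += outcome
        counts[key][1] += 1
    # Drop sparse buckets, whose base rates would be unreliable.
    return {k: yes / n for k, (yes, n) in counts.items() if n >= min_count}
```

Raising `min_count` trades coverage for reliability: buckets with only a handful of resolved markets are dropped rather than reported as confident base rates.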
ML Stacking Ensemble
A meta-learner that combines all four sources with microstructure features (spread, depth, trade flow) to produce the final calibrated probability. Trained with ECE as the primary objective, not accuracy — because in probability estimation, calibration is what matters.
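To make the stacking idea concrete, here is a deliberately small sketch: a logistic meta-learner over logit-transformed source probabilities plus any extra features. Binned ECE is not differentiable, so this sketch fits the weights with log loss and would use ECE only to compare candidate models; that substitution, the feature layout, and all function names are assumptions of the example, not a description of the actual training procedure.

```python
import math

def logit(p, eps=1e-6):
    """Log-odds of p, clipped away from 0 and 1 to stay finite."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Blended probability for one feature row."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def fit_meta_learner(rows, outcomes, lr=0.1, epochs=500):
    """Fit logistic-regression weights by batch gradient descent on log loss.

    rows     -- feature rows, e.g. logits of each source probability (hypothetical)
    outcomes -- 0/1 resolutions aligned with rows
    """
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, outcomes):
            err = predict(weights, bias, x) - y  # gradient of log loss w.r.t. the logit
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias
```

In practice the meta-learner should only ever be fit on source predictions that were themselves produced out of sample, otherwise the blend inherits lookahead bias.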
Research pipeline
Calibration & Prospect Theory in Prediction Markets
Examining how prediction market prices reflect known behavioral biases in probability weighting, and what that means for calibration across market types.
Adverse Selection & the Hump-Shape Pattern
Empirical isolation of Glosten-Milgrom adverse selection in PM order books. The informed/uninformed gap peaks at intermediate conviction, where fill-conditional accuracy differs by 20 percentage points.
ML Probability Estimation vs. Market Prices
A stacking ensemble combining PM, options, and microstructure features. 16% ECE improvement over raw PM prices. Walk-forward validated, no lookahead bias.
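Walk-forward validation means every test window lies strictly after the data used to fit the model, which is what rules out lookahead bias. A minimal sketch of an expanding-window splitter follows; the function name and the window sizes are illustrative assumptions.

```python
def walk_forward_splits(n_samples, initial_train, test_size):
    """Yield (train_indices, test_indices) with an expanding training window.

    Samples are assumed to be in chronological order. Each test window
    starts exactly where its training window ends.
    """
    start = initial_train
    while start + test_size <= n_samples:
        yield range(0, start), range(start, start + test_size)
        start += test_size  # the tested window joins the next training set
```

With 10 samples, an initial training window of 4, and test windows of 3, this yields two folds: train on 0-3 and test on 4-6, then train on 0-6 and test on 7-9.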
Price Discovery Speed in Dual-Market Settings
Cross-correlation analysis of PM vs. BTC spot price reaction times. PM reacts within 1-2 seconds. Lead-lag dynamics, Granger causality, and information flow measurement.
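The cross-correlation part of such an analysis can be sketched as follows: correlate one return series against shifted copies of the other and look for the lag with the strongest correlation. The series names and the sign convention (positive lag means the second series follows the first) are assumptions of this example; Granger causality testing is not covered here.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def lagged_correlations(leader, follower, max_lag):
    """Return {lag: corr(leader[t], follower[t + lag])} for lags in [-max_lag, max_lag].

    Both series must have the same length and sampling interval.
    """
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = leader[:len(leader) - lag], follower[lag:]
        else:
            a, b = leader[-lag:], follower[:len(follower) + lag]
        result[lag] = pearson(a, b)
    return result
```

Calling `max(result, key=result.get)` on the output gives the lag with the highest correlation; a positive value suggests the second series reacts after the first by that many sampling intervals.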
Interested in our research?
We work with investors, firms, and researchers who want to understand what prediction markets are really saying.
Get in Touch