Research

The data speaks. We translate.

Every claim is grounded in walk-forward validated data, ECE-first methodology, and reproducible analysis. No hand-waving. No overfitting. No storytelling without evidence.

Key Findings

What the data reveals

Finding 01

The inverse-S calibration curve

Prediction markets systematically overprice events at extreme probabilities and underprice events near 50/50. This is consistent with prospect theory's probability weighting function — a structural bias, not random noise.
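A calibration curve of this kind is usually measured by binning market prices and comparing each bin's average price with how often those events actually resolved YES. The following is a minimal sketch of that measurement, not the pipeline behind the finding; the function name, the ten equal-width bins, and the toy data are all assumptions made for illustration.

```python
def reliability_curve(prices, outcomes, n_bins=10):
    """Compare the mean price in each bin with the realized YES frequency."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(prices, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # a price of 1.0 goes in the last bin
        bins[idx].append((p, y))
    curve = []
    for members in bins:
        if not members:
            continue  # skip empty bins rather than report a fake point
        mean_price = sum(p for p, _ in members) / len(members)
        hit_rate = sum(y for _, y in members) / len(members)
        curve.append((mean_price, hit_rate, len(members)))
    return curve


# Toy data, invented for illustration only.
prices = [0.05, 0.05, 0.05, 0.05, 0.50, 0.50, 0.95, 0.95, 0.95, 0.95]
outcomes = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

curve = reliability_curve(prices, outcomes)
for mean_price, hit_rate, count in curve:
    gap = mean_price - hit_rate  # positive gap: the bin was priced above its hit rate
    print(f"price={mean_price:.2f} realized={hit_rate:.2f} n={count} gap={gap:+.2f}")
```

A positive gap marks a bin that was overpriced relative to outcomes and a negative gap marks one that was underpriced; plotting the gaps across bins shows the shape of the curve.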

Finding 02

Divergences widen rather than close

PM vs. options-implied probability gaps expand as expiry approaches, contrary to the convergence that the efficient market hypothesis would predict for two prices on the same event. The gap isn't noise to filter; it's signal to exploit. Different market microstructures produce systematically different prices.

Finding 03

The hump-shape adverse selection pattern

The informed/uninformed trader gap peaks at intermediate conviction levels — a clean empirical isolation of the Glosten-Milgrom mechanism. Fill-or-kill orders are adversely selected when the signal is ambiguous, not when it's extreme.

Finding 04

Ensemble outperforms any single source

A stacking ensemble combining PM prices, options-implied probabilities, and microstructure features achieves a 16% lower expected calibration error (ECE) than any individual source. The whole is genuinely greater than the sum of its parts.
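For readers unfamiliar with the metric, ECE is the count-weighted average gap between predicted probability and realized frequency across probability bins. The sketch below shows one common equal-width-bin version together with a relative-improvement helper. The bin count, the function names, and the example numbers are assumptions, and reading the 16% figure as a relative reduction is also an assumption.

```python
def expected_calibration_error(probs, outcomes, n_bins=10):
    """Weighted mean absolute gap between mean forecast and hit rate per bin."""
    n = len(probs)
    stats = [[0.0, 0.0, 0] for _ in range(n_bins)]  # [sum of probs, sum of outcomes, count]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)
        stats[idx][0] += p
        stats[idx][1] += y
        stats[idx][2] += 1
    ece = 0.0
    for prob_sum, outcome_sum, count in stats:
        if count:
            ece += (count / n) * abs(prob_sum / count - outcome_sum / count)
    return ece


def relative_improvement(baseline_ece, model_ece):
    """Fraction by which the model's ECE is lower than the baseline's."""
    return (baseline_ece - model_ece) / baseline_ece


# Invented example: two forecasts at 0.2 (one resolved YES), two at 0.8 (both YES).
ece = expected_calibration_error([0.2, 0.2, 0.8, 0.8], [0, 1, 1, 1])
print(round(ece, 4))

# Hypothetical ECE values chosen only to illustrate a 16% relative reduction.
print(round(relative_improvement(0.050, 0.042), 2))
```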

Methodology

The Consensus Probability Framework

Four probability sources, each drawing on a different information set, plus a fifth component, a stacking meta-learner, that fuses them into a single calibrated estimate.

PM Mid-Price

The prediction market order book midpoint — the most liquid and fastest-updating probability estimate. Captures crowd consensus but carries the inverse-S calibration bias documented in our research.

Call Spread Implied Probability

Derived from options markets using call spread pricing. Reflects the institutional view on the same event, priced through a fundamentally different mechanism with different participants and different regulatory constraints.
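The standard idea behind a call-spread probability is that a tight call spread approximates a digital option: its price divided by the strike width approximates the risk-neutral probability of finishing above the strikes. A minimal sketch under that assumption follows; the function name, parameters, and quotes are hypothetical, and the exact construction used in the framework is not described in the text.

```python
import math


def call_spread_probability(call_low, call_high, strike_low, strike_high,
                            rate=0.0, years_to_expiry=0.0):
    """Approximate risk-neutral P(S_T > K) for K near the middle of the two strikes.

    call_low / call_high are the prices of the calls struck at strike_low / strike_high.
    """
    width = strike_high - strike_low
    if width <= 0:
        raise ValueError("strike_high must be above strike_low")
    spread_price = call_low - call_high  # cost of buying the lower-strike call, selling the higher
    # Undo discounting so the result is a probability rather than a discounted one.
    prob = math.exp(rate * years_to_expiry) * spread_price / width
    return min(max(prob, 0.0), 1.0)  # clamp away small arbitrage or quote noise


# Invented quotes: calls struck at 99,000 and 101,000 priced at 2,600 and 1,700.
p = call_spread_probability(2600.0, 1700.0, 99_000.0, 101_000.0)
print(round(p, 3))
```

The narrower the spread, the closer the approximation, at the cost of more sensitivity to bid-ask noise in the two quotes.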

Breeden-Litzenberger Density

Extracts the full risk-neutral probability distribution from the options smile. Provides not just a point estimate but the market's entire belief about the outcome distribution — revealing tail risk and skew that point estimates miss.
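The Breeden-Litzenberger result states that the risk-neutral density is the discounting-adjusted second derivative of the call price with respect to strike, f(K) = e^{rT} ∂²C/∂K². The sketch below applies a central finite difference on an evenly spaced strike grid. Because no real option chain is given here, the call prices are generated from a Black-Scholes formula with made-up inputs; a real smile would need smoothing and interpolation first, which this sketch omits.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, vol, years, rate=0.0):
    """Black-Scholes call price, used only to synthesize example quotes."""
    sd = vol * math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / sd
    d2 = d1 - sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


def risk_neutral_density(strikes, calls, rate, years):
    """Central second difference of call prices on an evenly spaced strike grid."""
    step = strikes[1] - strikes[0]
    growth = math.exp(rate * years)
    density = []
    for i in range(1, len(strikes) - 1):
        second_diff = (calls[i - 1] - 2.0 * calls[i] + calls[i + 1]) / step ** 2
        density.append((strikes[i], growth * second_diff))
    return density


STEP = 1.0
strikes = [50.0 + STEP * i for i in range(151)]          # 50 to 200
calls = [bs_call(100.0, k, 0.2, 0.25) for k in strikes]  # synthetic quotes
density = risk_neutral_density(strikes, calls, rate=0.0, years=0.25)

total_mass = sum(f for _, f in density) * STEP            # should be close to 1
peak_strike = max(density, key=lambda kf: kf[1])[0]
prob_above_110 = sum(f for k, f in density if k > 110.0) * STEP  # a tail probability
print(round(total_mass, 4), peak_strike, round(prob_above_110, 4))
```

Summing the density over any strike range gives the probability of settling in that range, which is how tail risk and skew can be read off directly.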

Historical Lookup Table

Base rates computed from thousands of resolved markets. When BTC is at price X with volatility Y and the PM mid is Z, what actually happened historically? This grounds the model in empirical frequency, not just current market opinion.
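One simple way to realize such a table is to bucket each resolved market by its state variables and store the YES frequency per bucket. The sketch below assumes three keys (moneyness of spot versus strike, volatility, and PM mid), hypothetical bucket edges, and a minimum-count rule for sparse cells; none of these specifics come from the text.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical bucket edges, chosen only for illustration.
MONEYNESS_EDGES = [-0.01, 0.01]   # (spot - strike) / strike
VOL_EDGES = [0.4, 0.8]            # annualized volatility
MID_EDGES = [0.25, 0.5, 0.75]     # PM mid-price


def cell(moneyness, vol, pm_mid):
    """Map a market state to its (moneyness, vol, mid) bucket indices."""
    return (bisect_right(MONEYNESS_EDGES, moneyness),
            bisect_right(VOL_EDGES, vol),
            bisect_right(MID_EDGES, pm_mid))


def build_table(resolved):
    """resolved: iterable of (moneyness, vol, pm_mid, outcome) with outcome in {0, 1}."""
    table = defaultdict(lambda: [0, 0])  # cell -> [YES count, total count]
    for moneyness, vol, pm_mid, outcome in resolved:
        stats = table[cell(moneyness, vol, pm_mid)]
        stats[0] += outcome
        stats[1] += 1
    return table


def base_rate(table, moneyness, vol, pm_mid, min_count=3):
    """Historical YES frequency for the matching cell, or None if data is too thin."""
    wins, total = table.get(cell(moneyness, vol, pm_mid), (0, 0))
    return wins / total if total >= min_count else None


# Invented history: four similar states (three resolved YES) and one outlier.
history = [(0.02, 0.5, 0.70, 1), (0.03, 0.6, 0.65, 1), (0.02, 0.55, 0.72, 0),
           (0.015, 0.7, 0.60, 1), (-0.05, 0.9, 0.10, 0)]
table = build_table(history)
print(base_rate(table, 0.025, 0.65, 0.68))
print(base_rate(table, -0.05, 0.9, 0.10))
```

Returning None for thin cells keeps a single historical resolution from being reported as a 0% or 100% base rate.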

ML Stacking Ensemble

A meta-learner that combines all four sources with microstructure features (spread, depth, trade flow) to produce the final calibrated probability. Trained with ECE as the primary objective, not accuracy — because in probability estimation, calibration is what matters.
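To make the stacking step concrete, here is a deliberately small sketch: a logistic-regression meta-learner over the log-odds of the four sources plus one microstructure feature. Note the swap: ECE is not differentiable, so this sketch minimizes log-loss as a stand-in objective, which is not the ECE-first training described above. The feature layout, learning rate, and training rows are all assumptions.

```python
import math


def logit(p, eps=1e-6):
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def features(pm_mid, call_spread, bl_prob, base_rate, spread):
    """Bias term, log-odds of the four sources, and the order book spread."""
    return [1.0, logit(pm_mid), logit(call_spread), logit(bl_prob), logit(base_rate), spread]


def predict(weights, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def log_loss(weights, rows, labels):
    total = 0.0
    for x, y in zip(rows, labels):
        p = min(max(predict(weights, x), 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(rows)


def fit_meta_learner(rows, labels, lr=0.05, epochs=500):
    """Plain batch gradient descent on log-loss (a surrogate, not ECE itself)."""
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(rows)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


# Invented training rows: (pm_mid, call_spread, bl_prob, base_rate, spread), outcome.
raw = [((0.80, 0.75, 0.78, 0.70, 0.02), 1), ((0.70, 0.72, 0.69, 0.75, 0.03), 1),
       ((0.60, 0.50, 0.55, 0.45, 0.05), 0), ((0.30, 0.35, 0.33, 0.40, 0.04), 0),
       ((0.20, 0.25, 0.22, 0.20, 0.02), 0), ((0.55, 0.65, 0.60, 0.70, 0.06), 1)]
rows = [features(*inputs) for inputs, _ in raw]
labels = [y for _, y in raw]

loss_before = log_loss([0.0] * len(rows[0]), rows, labels)
weights = fit_meta_learner(rows, labels)
loss_after = log_loss(weights, rows, labels)
estimate = predict(weights, features(0.65, 0.60, 0.62, 0.58, 0.03))
print(round(loss_before, 4), round(loss_after, 4), round(estimate, 3))
```

In a production setting the meta-learner would be fitted only on out-of-sample predictions from earlier windows, so that it never sees outcomes it is later scored on.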

Papers

Research pipeline

In Progress

Calibration & Prospect Theory in Prediction Markets

Examining how prediction market prices reflect known behavioral biases in probability weighting, and what that means for calibration across market types.

In Progress · Strongest Candidate

Adverse Selection & the Hump-Shape Pattern

Empirical isolation of Glosten-Milgrom adverse selection in PM order books. The informed/uninformed gap peaks at intermediate conviction, reaching a 20-percentage-point gap in fill-conditional accuracy.

Data Collection

ML Probability Estimation vs. Market Prices

A stacking ensemble combining PM, options, and microstructure features. 16% ECE improvement over raw PM prices. Walk-forward validated, no lookahead bias.
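Walk-forward validation means every model is trained on one window of time-ordered data and scored only on the data that comes strictly after it. A minimal rolling-window splitter is sketched below; the window sizes and the choice of a rolling rather than expanding window are assumptions.

```python
def walk_forward_splits(n_samples, train_size, test_size, step=None):
    """Yield (train_indices, test_indices) where the test block always follows the train block."""
    step = step or test_size
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step  # slide the whole window forward in time


splits = [(list(tr), list(te)) for tr, te in walk_forward_splits(10, train_size=4, test_size=2)]
for train_idx, test_idx in splits:
    print(train_idx, "->", test_idx)
```

Because each test index is larger than every train index in the same split, no future observation can leak into training, which is what "no lookahead bias" refers to.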

Planned

Price Discovery Speed in Dual-Market Settings

Cross-correlation analysis of PM vs. BTC spot price reaction times. PM reacts within 1-2 seconds. Lead-lag dynamics, Granger causality, and information flow measurement.
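A lead-lag estimate from cross-correlation can be sketched as follows: shift one return series against the other and find the lag with the highest correlation. The series here are synthetic, with the PM series built as the spot series delayed by two samples, and the function names are hypothetical. Granger causality testing is a separate step and is not shown.

```python
import math


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(leader, follower, lag):
    """Correlation of leader[t] with follower[t + lag]; a negative lag reverses the roles."""
    if lag > 0:
        return pearson(leader[:-lag], follower[lag:])
    if lag < 0:
        return pearson(leader[-lag:], follower[:lag])
    return pearson(leader, follower)


def best_lag(leader, follower, max_lag):
    """Lag in [-max_lag, max_lag] with the highest correlation."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: lagged_correlation(leader, follower, lag))


# Synthetic returns: the PM series repeats the spot series two samples later.
spot = [0.1, -0.2, 0.3, 0.0, -0.1, 0.4, -0.3, 0.2, 0.1, -0.4, 0.2, 0.0]
pm = [0.0, 0.0] + spot[:-2]

lag = best_lag(spot, pm, max_lag=4)
corr = lagged_correlation(spot, pm, lag)
print(lag, round(corr, 3))
```

With one sample per second, a best lag of 2 would read as the PM following spot by about two seconds; a negative best lag would indicate the PM leading.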

20,000+ Training windows

835 Resolutions

167 Markets tracked

3.3 GB Frozen dataset

918 Articles ingested

Validation rules

Interested in our research?

We work with investors, firms, and researchers who want to understand what prediction markets are really saying.

Get in Touch