LLM as a Risk Manager: LLM Semantic Filtering for Lead-Lag Trading in Prediction Markets

Sumin Kim, Minjae Kim, Jihoon Kwon, Yoon Kim, Nicole Kagan, Joo Won Lee, Oscar Levy, Alejandro Lopez-Lira, Yongjae Lee, Chanyeol Choi

TL;DR

The paper tackles the instability and fragility of lead-lag discovery in prediction-market time series with a two-stage framework: Granger causality first identifies candidate leader-follower pairs, and an LLM then semantically re-ranks these candidates according to whether a plausible economic transmission mechanism links them. This semantic filtering acts as a robustness layer, prioritizing relationships likely to generalize under changing market conditions. Empirical results on Kalshi Economics markets show that LLM-based re-ranking substantially reduces downside risk and improves total returns, with win-rate gains distributed across moves of varying magnitudes. The work suggests that LLMs can function as semantic risk managers on top of statistical discovery, which is of practical use for trading strategies in event-driven markets where mechanism plausibility matters.

Abstract

Prediction markets provide a unique setting where event-level time series are directly tied to natural-language descriptions, yet discovering robust lead-lag relationships remains challenging due to spurious statistical correlations. We propose a hybrid two-stage causal screener to address this challenge: (i) a statistical stage that uses Granger causality to identify candidate leader-follower pairs from market-implied probability time series, and (ii) an LLM-based semantic stage that re-ranks these candidates by assessing whether the proposed direction admits a plausible economic transmission mechanism based on event descriptions. Because causal ground truth is unobserved, we evaluate the ranked pairs using a fixed, signal-triggered trading protocol that maps relationship quality into realized profit and loss (PnL). On Kalshi Economics markets, our hybrid approach consistently outperforms the statistical baseline. Across rolling evaluations, the win rate increases from 51.4% to 54.5%. Crucially, the average magnitude of losing trades decreases substantially from 649 USD to 347 USD. This reduction is driven by the LLM's ability to filter out statistically fragile links that are prone to large losses, rather than relying on rare gains. These improvements remain stable across different trading configurations, indicating that the gains are not driven by specific parameter choices. Overall, the results suggest that LLMs function as semantic risk managers on top of statistical discovery, prioritizing lead-lag relationships that generalize under changing market conditions.
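The two-stage screener described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Candidate` fields, the `two_stage_screen` function, and the `llm_score` callable are hypothetical names, the Granger p-values are assumed to have been computed elsewhere, and the LLM call is replaced by a lookup table of invented scores. The event descriptions are made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Candidate:
    """A directed leader -> follower pair (hypothetical structure)."""
    leader: str      # natural-language description of the leading event
    follower: str    # natural-language description of the following event
    p_value: float   # Granger p-value from the statistical stage (assumed given)


def two_stage_screen(
    candidates: List[Candidate],
    llm_score: Callable[[Candidate], float],
    k: int = 100,
    m: int = 20,
) -> List[Candidate]:
    """Keep the K most significant pairs, re-rank them semantically, return the top M."""
    # Stage 1: smaller p-value means stronger statistical evidence.
    top_k = sorted(candidates, key=lambda c: c.p_value)[:k]
    # Stage 2: higher score means a more plausible mechanism; ties fall back to p-value.
    reranked = sorted(top_k, key=lambda c: (-llm_score(c), c.p_value))
    return reranked[:m]


# Toy data: invented events, p-values, and plausibility scores.
pairs = [
    Candidate("Fed raises rates in June", "Mortgage rates above 7% in July", 0.030),
    Candidate("Egg prices above $4", "Fed raises rates in June", 0.001),
    Candidate("CPI above 3% in May", "Fed raises rates in June", 0.020),
]
toy_scores = {
    ("Fed raises rates in June", "Mortgage rates above 7% in July"): 0.9,
    ("Egg prices above $4", "Fed raises rates in June"): 0.1,
    ("CPI above 3% in May", "Fed raises rates in June"): 0.8,
}


def toy_llm_score(c: Candidate) -> float:
    """Stand-in for an LLM plausibility judgment."""
    return toy_scores[(c.leader, c.follower)]


selected = two_stage_screen(pairs, toy_llm_score, k=3, m=2)
for c in selected:
    print(f"{c.leader} -> {c.follower} (p={c.p_value})")
```

In this toy run the pair with the smallest p-value is dropped because its invented plausibility score is low, which is the kind of statistically strong but semantically weak link the semantic stage is meant to demote.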

Paper Structure

This paper contains 34 sections, 5 equations, 3 figures, and 4 tables.

Figures (3)

  • Figure 1: Two-stage framework for leader–follower pair discovery in prediction markets. Stage 1 produces a candidate set of Top K directed pairs (K=100) ranked by Granger significance, and Stage 2 applies LLM-based semantic re-ranking to select the final Top M portfolio (M=20).
  • Figure 2: Signal-triggered trading protocol used to evaluate ranked lead-lag relationships from Figure 1: leader price moves trigger follower trades, with direction determined by the Granger-induced trade sign and out-of-sample PnL used to evaluate the ranked pair list.
  • Figure 3: Prompt template used for LLM-based semantic filtering. Given a directed leader-follower event pair, the model assesses whether a plausible economic transmission mechanism exists (beyond correlation), assigns a strength level, and predicts the expected sign of co-movement, returning a structured JSON output.
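
The captions for Figures 2 and 3 describe a structured JSON verdict and a signal-triggered trade rule. The sketch below shows one way these pieces might look in code. It rests on assumptions: the JSON key names (`mechanism_exists`, `strength`, `expected_sign`), the trigger threshold, the integer-cent price convention, and the contract count are all hypothetical, since the captions do not specify them. The trade direction uses a separately supplied Granger-induced sign, as in the Figure 2 caption.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """Parsed LLM output for one directed pair (hypothetical schema)."""
    plausible: bool  # does a transmission mechanism exist beyond correlation?
    strength: str    # strength level assigned by the model
    sign: int        # expected sign of co-movement: +1 or -1


def parse_verdict(raw: str) -> Verdict:
    """Parse the structured JSON output, rejecting an invalid sign."""
    data = json.loads(raw)
    sign = int(data["expected_sign"])
    if sign not in (-1, 1):
        raise ValueError(f"expected_sign must be +1 or -1, got {sign}")
    return Verdict(bool(data["mechanism_exists"]), str(data["strength"]), sign)


def follower_trade(leader_move_cents: int, trade_sign: int, threshold_cents: int) -> int:
    """Return +1 (buy follower), -1 (sell follower), or 0 (no trade).

    A trade fires only when the leader's price move is at least the threshold;
    its direction is the leader's move direction times the pair's trade sign.
    """
    if abs(leader_move_cents) < threshold_cents:
        return 0
    direction = 1 if leader_move_cents > 0 else -1
    return direction * trade_sign


def realized_pnl(position: int, entry_cents: int, exit_cents: int, contracts: int) -> int:
    """PnL in cents for a signed position held from entry to exit."""
    return position * (exit_cents - entry_cents) * contracts


# Toy example with invented numbers.
raw = '{"mechanism_exists": true, "strength": "strong", "expected_sign": 1}'
verdict = parse_verdict(raw)

granger_sign = 1  # assumed sign from the statistical stage
position = follower_trade(leader_move_cents=8, trade_sign=granger_sign, threshold_cents=5)
pnl_cents = realized_pnl(position, entry_cents=40, exit_cents=46, contracts=100)
print(verdict, position, pnl_cents)
```

Prices are kept as integer cents here to avoid floating-point rounding in the PnL arithmetic; a real evaluation would also need an exit rule and out-of-sample windows, which this sketch leaves out.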