Semantic Trading: Agentic AI for Clustering and Relationship Discovery in Prediction Markets

Agostino Capponi, Alfio Gliozzo, Brian Zhu

TL;DR

Prediction markets suffer from fragmentation across related propositions, hindering discovery and hedging. The authors introduce an agentic AI pipeline that semantically clusters markets and discovers within-cluster relationships, using a modular MCP-based workflow to generate same/different outcome hypotheses. They validate the approach on resolved Polymarket data, showing relationship accuracy around 60–70% and a simple leader–follower trading rule producing meaningful weekly returns, with performance varying by month. The work demonstrates a scalable, auditable AI-driven discovery layer that can enhance search, hedging, and risk management in prediction-market infrastructures, while outlining avenues for handling distribution shifts and multi-outcome events.

Abstract

Prediction markets allow users to trade on outcomes of real-world events, but are prone to fragmentation through overlapping questions, implicit equivalences, and hidden contradictions across markets. We present an agentic AI pipeline that autonomously (i) clusters markets into coherent topical groups using natural-language understanding over contract text and metadata, and (ii) identifies within-cluster market pairs whose resolved outcomes exhibit strong dependence, including same-outcome (correlated) and different-outcome (anti-correlated) relationships. Using a historical dataset of resolved markets on Polymarket, we evaluate the accuracy of the agent's relational predictions. We then translate discovered relationships into a simple trading strategy to quantify how these relationships map to actionable signals. Results show that agent-identified relationships achieve roughly 60-70% accuracy, and their induced trading strategies earn about 20% average returns over week-long horizons, highlighting the ability of agentic AI and large language models to uncover latent semantic structure in prediction markets.
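
The relationship-accuracy evaluation described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes binary Yes/No markets, and the names `PairPrediction` and `relationship_accuracy`, along with the market identifiers, are hypothetical. A predicted "same" pair counts as correct when both markets resolve to the same outcome, and a predicted "different" pair counts as correct when they resolve to opposite outcomes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairPrediction:
    """Hypothetical container for one agent-proposed relationship."""
    market_a: str
    market_b: str
    relation: str  # "same" or "different" (assumed labels)


def relationship_accuracy(predictions, outcomes):
    """Fraction of predicted pairs whose resolved outcomes match the relation.

    `outcomes` maps a market id to its resolved binary outcome (True = Yes).
    Pairs with an unresolved leg are skipped rather than counted as wrong.
    """
    correct = 0
    scored = 0
    for p in predictions:
        if p.market_a not in outcomes or p.market_b not in outcomes:
            continue  # at least one market has not resolved yet
        resolved_same = outcomes[p.market_a] == outcomes[p.market_b]
        predicted_same = p.relation == "same"
        correct += predicted_same == resolved_same
        scored += 1
    return correct / scored if scored else float("nan")


# Toy data, invented purely for illustration.
preds = [
    PairPrediction("A", "B", "same"),
    PairPrediction("A", "C", "different"),
    PairPrediction("B", "C", "same"),
]
resolved = {"A": True, "B": True, "C": False}
accuracy = relationship_accuracy(preds, resolved)
```

In this toy example the first two predictions agree with the resolved outcomes and the third does not, so `accuracy` is 2/3. Skipping unresolved pairs is a design choice of the sketch; the paper's own evaluation uses resolved markets only.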

Paper Structure

This paper contains 23 sections, 4 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Leader–follower price dynamics for a discovered pair.
  • Figure 2: Relationship graph for a cluster relating to tariff policies.
  • Figure 3: Category frequencies for clusters and markets. For each category, the blue bars report the number of underlying markets (questions), while the orange bars report the number of clusters assigned to that category.
  • Figure 4: Mean relationship-prediction accuracy by category. Bars show the average accuracy (percent correct) of the agent's predicted relationships, aggregated over all clusters within each category label.