Towards More Accurate US Presidential Election via Multi-step Reasoning with Large Language Models
Chenxiao Yu, Zhaotian Weng, Yuangang Li, Zheng Li, Xiyang Hu, Yue Zhao
TL;DR
The paper tackles US presidential election prediction with large language models (LLMs), addressing data scarcity and evolving political contexts through a multi-step reasoning framework. It combines real-world ANES time-series data with SynC-generated synthetic populations and compares three prompting pipelines that progressively add temporal context and Chain-of-Thought reasoning, with simulated votes aggregated at the state level to reflect election dynamics. The multi-step V3 pipeline, which first places each voter on a conservative-liberal spectrum and then simulates the vote with time-aware prompts, aligns best with ground-truth results, reaching an AUC of around $0.90$ on state-level predictions and performing strongly in swing states. The work demonstrates a scalable, privacy-preserving approach to political forecasting with LLMs and points to future enhancements, including multi-LLM ensembles and refined temporal models, to further improve reliability and reduce bias.
Abstract
Can Large Language Models (LLMs) accurately predict election outcomes? While LLMs have demonstrated impressive performance in various domains, including healthcare, legal analysis, and creative tasks, their ability to forecast elections remains unknown. Election prediction poses unique challenges, such as limited voter-level data, rapidly changing political landscapes, and the need to model complex human behavior. To address these challenges, we introduce a multi-step reasoning framework designed for political analysis. We validate our approach on real-world data from the American National Election Studies (ANES) 2016 and 2020, as well as on synthetic personas generated by the SynC framework, which offer scalable datasets for voter behavior modeling. To capture temporal dynamics, we incorporate candidates' policy positions and biographical details, so that the model adapts to evolving political contexts. Drawing on Chain-of-Thought prompting, our multi-step reasoning pipeline systematically integrates demographic, ideological, and time-dependent factors, enhancing the model's predictive power.
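The multi-step idea can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt wording, the persona fields, the `llm` callable, and the function names are all assumptions made for illustration. The sketch first asks the model for a voter's ideology, then feeds that answer into a time-aware vote prompt, and finally takes a plurality of simulated votes per state.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List

# Hypothetical persona record: free-form demographic fields plus a "state" key.
Persona = Dict[str, str]


def _profile(p: Persona) -> str:
    """Flatten a persona into a single prompt-friendly string."""
    return "; ".join(f"{k}: {v}" for k, v in p.items())


def ideology_prompt(p: Persona, year: int) -> str:
    # Step 1 (assumed wording): place the voter on a conservative-liberal spectrum.
    return (f"Year: {year}. Voter profile: {_profile(p)}. "
            "Where does this voter fall on the conservative-liberal spectrum? "
            "Answer with one word.")


def vote_prompt(p: Persona, ideology: str, year: int,
                candidates: Dict[str, str]) -> str:
    # Step 2 (assumed wording): add the inferred ideology and candidate context.
    slate = " | ".join(f"{name}: {info}" for name, info in candidates.items())
    return (f"Year: {year}. Voter profile: {_profile(p)}. "
            f"Ideology: {ideology}. Candidates: {slate}. "
            "Which candidate does this voter choose? Answer with the name.")


def simulate_vote(p: Persona, year: int, candidates: Dict[str, str],
                  llm: Callable[[str], str]) -> str:
    """Run both reasoning steps for one persona and return a candidate name."""
    ideology = llm(ideology_prompt(p, year)).strip()
    answer = llm(vote_prompt(p, ideology, year, candidates)).lower()
    for name in candidates:
        if name.lower() in answer:
            return name
    return "unknown"  # the reply named no listed candidate


def state_winners(personas: List[Persona], year: int,
                  candidates: Dict[str, str],
                  llm: Callable[[str], str]) -> Dict[str, str]:
    """Aggregate simulated votes into a plurality winner per state."""
    tallies: Dict[str, Counter] = defaultdict(Counter)
    for p in personas:
        vote = simulate_vote(p, year, candidates, llm)
        if vote != "unknown":
            tallies[p["state"]][vote] += 1
    return {state: c.most_common(1)[0][0] for state, c in tallies.items()}
```

Here `llm` is any function that maps a prompt string to a reply string, so the same sketch can wrap a hosted model or a deterministic stub for testing. Passing `year` and candidate descriptions into the prompts is one simple way to make the simulation time-aware.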
