FlockVote: LLM-Empowered Agent-Based Modeling for Simulating U.S. Presidential Elections
Lingfeng Zhou, Yi Xu, Zhenyu Wang, Dequan Wang
TL;DR
FlockVote addresses the trade-off between predictive capability and interpretability in election modeling by using LLM-driven agents endowed with high-fidelity demographic profiles and contextual policy information. The framework treats the agent population as a computational laboratory: at the macro level, simulated results are compared against real election outcomes, and at the micro level, researchers can inspect each agent's stated rationale. Systematic reliability analyses of bias, instability, and context sensitivity show both the potential and the current limitations of LLM agents in high-stakes social simulations. The authors advocate for broader applications beyond politics and propose methodological steps toward auditing and mitigating model flaws to advance computational social science.
Abstract
Modeling complex human behavior, such as voter decisions in national elections, is a long-standing challenge for computational social science. Traditional agent-based models (ABMs) are limited by oversimplified rules, while large-scale statistical models often lack interpretability. We introduce FlockVote, a novel framework that uses Large Language Models (LLMs) to build a "computational laboratory" of LLM agents for political simulation. Each agent is instantiated with a high-fidelity demographic profile and dynamic contextual information (e.g., candidate policies), enabling it to perform nuanced, generative reasoning to simulate a voting decision. We deploy this framework as a testbed on the 2024 U.S. Presidential Election, focusing on seven key swing states. The simulation's macro-level results replicate the real-world outcome, demonstrating the high fidelity of our "virtual society". Our contribution is not only the prediction but also the framework's utility as an interpretable research tool. FlockVote moves beyond black-box outputs, allowing researchers to probe agent-level rationales and analyze the stability and sensitivity of LLM-driven social simulations.
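The agent loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the profile fields, prompt wording, reply format, candidate labels, and the function names `build_prompt`, `parse_reply`, `simulate`, and `stub_llm` are all assumptions made for illustration, and a deterministic stub stands in for a real LLM call so the example runs offline.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class VoterProfile:
    # Hypothetical fields; the source does not enumerate the profile schema.
    state: str
    age: int
    education: str
    party_lean: str  # placeholder labels: "A", "B", or "none"


def build_prompt(profile: VoterProfile, context: str) -> str:
    """Combine a demographic profile with contextual information into one prompt."""
    return (
        f"You are a {profile.age}-year-old voter in {profile.state}.\n"
        f"Education: {profile.education}. Party lean: {profile.party_lean}.\n"
        f"Context: {context}\n"
        "Reply with 'VOTE: <candidate>' on one line and 'REASON: <why>' on the next."
    )


def parse_reply(reply: str) -> tuple[str, str]:
    """Extract the vote and the stated rationale; unparseable replies are kept visible."""
    vote, reason = "unparsed", ""
    for line in reply.splitlines():
        if line.startswith("VOTE:"):
            vote = line[len("VOTE:"):].strip()
        elif line.startswith("REASON:"):
            reason = line[len("REASON:"):].strip()
    return vote, reason


def simulate(
    profiles: Iterable[VoterProfile],
    context: str,
    ask_llm: Callable[[str], str],
) -> tuple[Counter, list]:
    """Query one agent per profile; return the macro tally and micro-level rationales."""
    tally: Counter = Counter()
    rationales = []
    for profile in profiles:
        vote, reason = parse_reply(ask_llm(build_prompt(profile, context)))
        tally[vote] += 1
        rationales.append((profile, vote, reason))
    return tally, rationales


def stub_llm(prompt: str) -> str:
    # Deterministic stand-in for an LLM call (assumption: a real run would call a model).
    if "Party lean: A" in prompt:
        return "VOTE: Candidate A\nREASON: Matches my party lean."
    return "VOTE: Candidate B\nREASON: Preferred on the listed policies."


profiles = [
    VoterProfile("Example State", 34, "college", "A"),
    VoterProfile("Example State", 61, "high school", "B"),
    VoterProfile("Example State", 45, "college", "none"),
]
tally, rationales = simulate(profiles, "Placeholder policy summaries.", stub_llm)
print(tally.most_common())
```

Returning the per-agent rationales alongside the tally mirrors the macro/micro split in the abstract: the tally is what gets compared with real outcomes, while the rationales are what a researcher would inspect. Rerunning `simulate` with a different `context` string or a different `ask_llm` is one simple way to probe sensitivity and stability under these assumptions.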
