Political-LLM: Large Language Models in Political Science

Lincan Li, Jiaqi Li, Catherine Chen, Fred Gui, Hongjia Yang, Chenxiao Yu, Zhengguang Wang, Jianing Cai, Junlong Aaron Zhou, Bolin Shen, Alex Qian, Weixin Chen, Zhongkai Xue, Lichao Sun, Lifang He, Hanjie Chen, Kaize Ding, Zijian Du, Fangzhou Mu, Jiaxin Pei, Jieyu Zhao, Swabha Swayamdipta, Willie Neiswanger, Hua Wei, Xiyang Hu, Shixiang Zhu, Tianlong Chen, Yingzhou Lu, Yang Shi, Lianhui Qin, Tianfan Fu, Zhengzhong Tu, Yuzhe Yang, Jaemin Yoo, Jiaheng Zhang, Ryan Rossi, Liang Zhan, Liang Zhao, Emilio Ferrara, Yan Liu, Furong Huang, Xiangliang Zhang, Lawrence Rothenberg, Shuiwang Ji, Philip S. Yu, Yue Zhao, Yushun Dong

TL;DR

This survey introduces Political-LLM, a principled framework for integrating large language models into computational political science. It builds a two-part taxonomy distinguishing political science functionalities from LLM-driven computational methods, and surveys applications across predictive, generative, simulation, and causal-inference tasks while addressing ethical and societal implications. It covers technical foundations, including benchmark datasets, data preparation, fine-tuning, zero-/few-shot inference, and advanced inference techniques, and includes an empirical ANES-based voting-simulation case study. Finally, it outlines future directions (modular pipelines, domain-specific data generation, bias mitigation, explainability, and novel evaluation criteria) that aim to advance responsible, impactful use of AI in political science.

Abstract

In recent years, large language models (LLMs) have been widely adopted in political science tasks such as election prediction, sentiment analysis, policy impact assessment, and misinformation detection. Meanwhile, the need to systematically understand how LLMs can further revolutionize the field has become urgent. In this work, we, a multidisciplinary team of researchers spanning computer science and political science, present the first principled framework, termed Political-LLM, to advance the comprehensive understanding of integrating LLMs into computational political science. Specifically, we first introduce a fundamental taxonomy that classifies existing explorations from two perspectives: political science and computational methodologies. From the political science perspective, we highlight the role of LLMs in automating predictive and generative tasks, simulating behavior dynamics, and improving causal inference through tools like counterfactual generation; from the computational perspective, we introduce advancements in data preparation, fine-tuning, and evaluation methods for LLMs that are tailored to political contexts. We identify key challenges and future directions, emphasizing the development of domain-specific datasets, addressing issues of bias and fairness, incorporating human expertise, and redefining evaluation criteria to align with the unique requirements of computational political science. Political-LLM seeks to serve as a guidebook for researchers to foster an informed, ethical, and impactful use of Artificial Intelligence in political science. Our online resource is available at: http://political-llm.org/.

Paper Structure

This paper contains 31 sections, 9 figures, 3 tables.

Figures (9)

  • Figure 1: LLMs are revolutionizing political science through advanced language analysis and interdisciplinary integration capabilities.
  • Figure 2: The proposed Taxonomy on LLM for Political Science.
  • Figure 3: The workflow of an LLM-based automated predictive task, using U.S. Presidential Election prediction as an example.
  • Figure 4: Workflow for LLM-based generative tasks, illustrating the synthesis of political speeches with specific ideology, style, and focus of content.
  • Figure 5: Illustration of the OpinionQA dataset preparation from publicly available data sources.
  • ...and 4 more figures