Agent-Enhanced Large Language Models for Researching Political Institutions

Joseph R. Loffredo, Suyeol Yun

TL;DR

This paper demonstrates how LLMs, when augmented with predefined functions and specialized tools, can serve as dynamic agents capable of streamlining tasks such as data collection, preprocessing, and analysis.

Abstract

The applications of Large Language Models (LLMs) in political science are rapidly expanding. This paper demonstrates how LLMs, when augmented with predefined functions and specialized tools, can serve as dynamic agents capable of streamlining tasks such as data collection, preprocessing, and analysis. Central to this approach is agentic retrieval-augmented generation (Agentic RAG), which equips LLMs with action-calling capabilities for interaction with external knowledge bases. Beyond information retrieval, LLM agents may incorporate modular tools for tasks like document summarization, transcript coding, qualitative variable classification, and statistical modeling. To demonstrate the potential of this approach, we introduce CongressRA, an LLM agent designed to support scholars studying the U.S. Congress. Through this example, we highlight how LLM agents can reduce the costs of replicating, testing, and extending empirical research using the domain-specific data that drives the study of political institutions.
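The action-calling idea described in the abstract can be illustrated with a minimal sketch: the researcher registers functions under names, and a dispatcher runs whichever one the model proposes. The tool names, the action format, and the placeholder return strings below are all hypothetical and are not taken from CongressRA itself.

```python
from typing import Callable, Dict

# Hypothetical tool registry; names and outputs are placeholders,
# not the actual tools shipped with CongressRA.
TOOLS: Dict[str, Callable[[str], str]] = {}


def tool(name: str):
    """Register a function under a name that the agent can call."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return decorator


@tool("web_search")
def web_search(query: str) -> str:
    # Stand-in for a real web search call.
    return f"[web results for: {query}]"


@tool("sql_query")
def sql_query(query: str) -> str:
    # Stand-in for a real SQL database lookup.
    return f"[rows matching: {query}]"


def dispatch(action: dict) -> str:
    """Run the tool named in a model-proposed action.

    The action format {"tool": ..., "input": ...} is an assumption
    made for this sketch.
    """
    name = action["tool"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](action["input"])
```

In a full agent loop, the model's output would be parsed into such an action, the tool result would be appended to the conversation, and the model would decide whether to call another tool or answer.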

Paper Structure

This paper contains 9 sections, 6 figures, 1 table.

Figures (6)

  • Figure 1: Retrieval-Augmented Generation (RAG). RAG supplements a user's query with external information to allow an LLM to create output with up-to-date and contextually relevant content.
  • Figure 2: Agentic RAG. Given a set of researcher-defined functions and an AI-assistant framework, LLMs can act as autonomous agents that dynamically decide when, where, and how to retrieve external information.
  • Figure 3: CongressRA, an LLM agent for the study of the U.S. Congress. Four main sets of tools, along with their related external data sources, are available to our LLM agent: web search, API access, SQL database querying, and vector database querying.
  • Figure 4: Using CongressRA to measure legislative gridlock. Following the methodology of Binder (1999), a user submits three prompts to the LLM agent to identify salient policy issues, search for relevant legislation, and determine whether any such legislation was enacted into law. This process is repeated to produce a value of legislative gridlock for each session of Congress.
  • Figure 5: Legislative Gridlock (113th-118th Congress)
  • ...and 1 more figure
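
The gridlock measurement outlined in the Figure 4 caption can be sketched as a small calculation. This sketch assumes that gridlock for a Congress is the share of salient issues for which no relevant legislation was enacted; the caption does not state the formula, so that definition, the function name, and the sample data are all illustrative assumptions.

```python
from typing import Dict


def gridlock_score(enacted_by_issue: Dict[str, bool]) -> float:
    """Share of salient issues with no enacted legislation.

    Assumed definition for illustration; keys are issue labels and
    values record whether any relevant legislation became law.
    """
    if not enacted_by_issue:
        raise ValueError("need at least one salient issue")
    failed = sum(1 for enacted in enacted_by_issue.values() if not enacted)
    return failed / len(enacted_by_issue)


# Made-up data for one hypothetical Congress, not results from the paper.
example = {
    "issue_a": True,
    "issue_b": False,
    "issue_c": False,
    "issue_d": True,
}
print(gridlock_score(example))  # 0.5
```

In the workflow the caption describes, the issue list and the enactment flags would come from the agent's three prompts rather than from a hand-written dictionary.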