Measuring Interest Group Positions on Legislation: An AI-Driven Analysis of Lobbying Reports

Jiseon Kim, Dongkwan Kim, Joohye Jeong, Alice Oh, In Song Kim

TL;DR

This work tackles the challenge of directly measuring special interest groups' positions on a wide array of U.S. bills by integrating LLM-based text annotation with a heterogeneous GNN-driven network annotation pipeline. The authors construct a large-scale bill-position dataset covering the 111th–117th Congresses and derive LPscores, which quantify latent SIG preferences via an IRT framework, enabling cross-bill and cross-industry analyses. Key findings reveal systematic lobbying patterns across legislative stages, firm size–driven differences in lobbying behavior, and topic- and industry-specific variations in position distributions, offering a scalable framework for studying how interest groups shape policy. The dataset and methodological toolkit promise to enhance transparency and enable future research on lobbying strategies, policy outcomes, and the political economy of legislation.

Abstract

Special interest groups (SIGs) in the U.S. participate in a range of political activities, such as lobbying and making campaign donations, to influence policy decisions in the legislative and executive branches. The competing interests of these SIGs have profound implications for global issues such as international trade policies, immigration, climate change, and global health challenges. Despite the significance of understanding SIGs' policy positions, empirical challenges in observing them have often led researchers to rely on indirect measurements or focus on a select few SIGs that publicly support or oppose a limited range of legislation. This study introduces the first large-scale effort to directly measure and predict a wide range of bill positions (Support, Oppose, and Engage, where Engage covers Amend and Monitor) across all legislative bills introduced from the 111th to the 117th Congresses. We leverage an advanced AI framework, including large language models (LLMs) and graph neural networks (GNNs), to develop a scalable pipeline that automatically extracts these positions from lobbying activities, resulting in a dataset of 42k bills annotated with 279k bill positions of 12k SIGs. With this large-scale dataset, we reveal (i) a strong correlation between a bill's progression through legislative process stages and the positions taken by interest groups, (ii) a significant relationship between firm size and lobbying positions, (iii) notable distinctions in lobbying position distribution based on bill subject, and (iv) heterogeneity in the distribution of policy preferences across industries. We introduce a novel framework for examining lobbying strategies and offer opportunities to explore how interest groups shape the political landscape.
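
As a rough illustration of the two-stage pipeline summarized above (and in the Figure 1 caption), the control flow can be sketched as a text-first annotator with a network-based fallback. This is a minimal sketch, not the authors' implementation: `LobbyingRecord`, `annotate`, and the two toy annotators are hypothetical names, and the keyword matcher and constant fallback merely stand in for the LLM and GNN models.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# The four position labels named in the abstract.
POSITIONS = ("Support", "Oppose", "Amend", "Monitor")


@dataclass
class LobbyingRecord:
    """Hypothetical record layout; the real data schema is not given in the source."""
    sig: str       # interest group name
    bill_id: str   # bill identifier (format is an assumption)
    text: str      # free-text description from the lobbying report


def annotate(
    record: LobbyingRecord,
    text_annotator: Callable[[LobbyingRecord], Optional[str]],
    network_annotator: Callable[[LobbyingRecord], str],
) -> Tuple[str, str]:
    """Return (position, source): use the text stage when it yields a clear
    label, otherwise fall back to the network stage."""
    label = text_annotator(record)
    if label in POSITIONS:
        return label, "LLM"
    return network_annotator(record), "GNN"


def toy_text_annotator(record: LobbyingRecord) -> Optional[str]:
    """Stand-in for the LLM stage (assumption): a naive keyword match that
    returns None when the text is not clear enough."""
    text = record.text.lower()
    if "support" in text:
        return "Support"
    if "oppose" in text:
        return "Oppose"
    return None


def toy_network_annotator(record: LobbyingRecord) -> str:
    """Stand-in for the GNN stage (assumption): always predicts 'Monitor'."""
    return "Monitor"


clear = LobbyingRecord("Group A", "bill-1", "We support passage of the bill.")
vague = LobbyingRecord("Group B", "bill-1", "Issues related to the bill.")
clear_result = annotate(clear, toy_text_annotator, toy_network_annotator)
vague_result = annotate(vague, toy_text_annotator, toy_network_annotator)
```

Passing the two stages as callables keeps the fallback logic independent of any particular model, so either stand-in could be swapped for a real classifier without changing `annotate`.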

Paper Structure

This paper contains 32 sections, 12 figures, 21 tables.

Figures (12)

  • Figure 1: Overview of bill position annotation pipeline. This outlines the two approaches used in our bill position annotation process. First, in the LLM Annotation stage, text information is extracted from raw data sources, and positions are annotated using a large language model (LLM). When the text alone is not clear enough to determine the bill position, lobbying and legislative network data are incorporated in the GNN Annotation pipeline, where a graph neural network (GNN) is used to refine the prediction. Minimizing human annotation makes the pipeline more scalable while still capturing the bill positions of interest groups involved in the legislative process, yielding a dataset that reflects real-world legislative activities and a wider range of lobbying types.
  • Figure 2: Frequency of interest group positions across different legislative stages. We compare the bill position distributions of interest groups across different legislative stages. The x-axis represents the stages of the legislative process, from bill introduction to enactment, with lines depicting the frequency of positions: Support (blue), Oppose (red), Amend (yellow), and Monitor (green). This analysis includes bills that were ultimately 'Enacted' and 'Vetoed'. Bills with a higher frequency of Support are more likely to advance through the legislative process ('Enacted'), while those with significant increases in Oppose or Amend typically fail to progress ('Vetoed'). We observe a clear correlation between interest group positions and the eventual legislative outcome of the bills.
  • Figure 3: Predicted probability changes in lobbying pattern between firms at the 90th and 10th percentiles of employment. The top panel presents logistic regression–based estimates for the lobbying decision, while the bottom panel shows Dirichlet regression–based estimates for four lobbying positions (Support, Oppose, Amend, Monitor). Each model uses firm‐year level data and includes year fixed effects, and error bars represent the 90% and 95% confidence intervals (thick and thin lines, respectively) obtained using block bootstrap at the firm level.
  • Figure 4: Bill position ratios across different bill subjects. This shows the bill position ratios (Support, Oppose, and Engage) for the top 15 bill subjects containing the largest number of bills. The subjects are ordered by their Support ratio, illustrating how lobbying strategies vary across different legislative topics. Taxation-related bills have the highest Support ratio, indicating strong industry backing for favorable fiscal policies. In contrast, bills concerning Public lands and natural resources exhibit a higher Oppose ratio, reflecting resistance from industries affected by environmental regulations. Meanwhile, subjects such as Economics and public finance show a greater proportion of Engage positions, as lobbying in these cases often focuses on influencing appropriation decisions and securing favorable budget allocations rather than taking a definitive stance for or against the legislation.
  • Figure 5: Industry-wise distribution of interest group preferences. The boxplot illustrates the distribution of LPscores for 1,414 interest groups across 40 industries. Industries are sorted by average score in descending order, with economic and conservative-aligned sectors at the top and social/progressive sectors at the bottom. Industries with conflicting interests (e.g., Republican vs. Democratic, Pro-Abortion vs. Anti-Abortion, Gun Rights vs. Gun Control) are distinctly positioned, highlighting ideological divides.
  • ...and 7 more figures
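
The per-subject position ratios described in the Figure 4 caption can be computed with a simple count-and-normalize step. The sketch below is illustrative only: the function names are hypothetical, Amend and Monitor are folded into Engage following the abstract's grouping, and the toy records are made-up values, not data from the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Amend and Monitor are grouped under Engage, as in the abstract.
TO_COARSE = {
    "Support": "Support",
    "Oppose": "Oppose",
    "Amend": "Engage",
    "Monitor": "Engage",
}
COARSE = ("Support", "Oppose", "Engage")


def position_ratios(
    records: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Map each bill subject to its share of Support / Oppose / Engage positions.

    `records` yields (subject, position) pairs, one per annotated bill position.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for subject, position in records:
        counts[subject][TO_COARSE[position]] += 1
    ratios = {}
    for subject, c in counts.items():
        total = sum(c.values())
        ratios[subject] = {p: c[p] / total for p in COARSE}
    return ratios


def rank_by_support(ratios: Dict[str, Dict[str, float]]) -> List[str]:
    """Order subjects by Support ratio, highest first (the ordering Figure 4 uses)."""
    return sorted(ratios, key=lambda s: ratios[s]["Support"], reverse=True)


# Toy records (invented for illustration, not taken from the dataset).
toy_records = [
    ("Taxation", "Support"), ("Taxation", "Support"),
    ("Taxation", "Monitor"), ("Taxation", "Oppose"),
    ("Public lands", "Oppose"), ("Public lands", "Support"),
    ("Public lands", "Amend"), ("Public lands", "Oppose"),
]
toy_ratios = position_ratios(toy_records)
toy_order = rank_by_support(toy_ratios)
```

Restricting the input to the 15 subjects with the most bills, as the caption does, would be a filter on `records` before calling `position_ratios`.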