Identifying attributions of causality in political text

Paulina Garcia-Corral

TL;DR

This paper presents a scalable framework for extracting and analyzing causal attributions in political text by fine-tuning causal language models for sequence classification and span detection to produce structured cause–effect representations. It leverages a PolitiCAUSE-inspired semantic annotation scheme and span-reconstruction to create interpretable data, validated on armed-conflict headlines from Al Jazeera, BBC, and CNN. The empirical analysis uses a log-odds framing approach with Dirichlet priors to compare how actors are framed across conflicts and outlets, revealing stable Russia-as-cause framing in Eastern Europe and outlet-specific patterns in the Middle East. The work demonstrates how causal explanations—viewed as non-neutral rhetorical devices—can be measured at scale to illuminate attribution, responsibility, and framing in political discourse, with broad methodological implications for political science and communication research.
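The log-odds comparison mentioned above can be illustrated with a short sketch. It assumes the widely used log-odds ratio with a Dirichlet prior (in the style of Monroe et al.'s "Fightin' Words") and a uniform prior weight `alpha`; the paper's exact specification may differ. The function name `log_odds_dirichlet` and the counts are hypothetical toy values, not results from the paper.

```python
import math
from typing import Dict


def log_odds_dirichlet(
    counts_a: Dict[str, int],
    counts_b: Dict[str, int],
    alpha: float = 1.0,
) -> Dict[str, float]:
    """Z-scored log-odds ratio of each item in group A versus group B.

    Assumption: a uniform Dirichlet prior with weight `alpha` per item.
    Positive scores mean the item is over-represented in group A.
    """
    vocab = set(counts_a) | set(counts_b)
    if len(vocab) < 2:
        raise ValueError("need at least two distinct items to form odds")
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    a0 = alpha * len(vocab)  # total prior mass

    scores = {}
    for item in vocab:
        y_a = counts_a.get(item, 0)
        y_b = counts_b.get(item, 0)
        # Smoothed log-odds of the item within each group
        lo_a = math.log((y_a + alpha) / (n_a + a0 - y_a - alpha))
        lo_b = math.log((y_b + alpha) / (n_b + a0 - y_b - alpha))
        # Approximate variance of the difference in log-odds
        var = 1.0 / (y_a + alpha) + 1.0 / (y_b + alpha)
        scores[item] = (lo_a - lo_b) / math.sqrt(var)
    return scores


# Toy counts (illustrative only): how often each actor appears
# in the cause span versus the effect span of extracted claims.
as_cause = {"Russia": 30, "Ukraine": 5}
as_effect = {"Russia": 5, "Ukraine": 30}
z = log_odds_dirichlet(as_cause, as_effect)
```

In this toy setup an actor with a positive score appears more often as a cause than as an effect. The prior shrinks scores for rarely mentioned actors toward zero, which keeps sparse counts from dominating the comparison.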

Abstract

Explanations are a fundamental element of how people make sense of the political world. Citizens routinely ask and answer questions about why events happen, who is responsible, and what could or should be done differently. Yet despite their importance, explanations remain an underdeveloped object of systematic analysis in political science, and existing approaches are fragmented and often issue-specific. I introduce a framework for detecting and parsing explanations in political text. To do this, I train a lightweight causal language model that returns a structured data set of causal claims in the form of cause-effect pairs for downstream analysis. I demonstrate how causal explanations can be studied at scale, and show the method's modest annotation requirements, generalizability, and accuracy relative to human coding.

Paper Structure

This paper contains 41 sections, 11 equations, 12 figures, 8 tables.

Figures (12)

  • Figure 1: From the sentence "The mayor's policies led to increased housing costs" we can identify the cause and effect and represent it in graph form.
  • Figure 2: Competing actors frequently offer alternative causal accounts of the same event. Using causal extraction, we can parse a corpus, and extract the causal attributions made around specific issues of interest.
  • Figure 3: The first sentence is labeled as "causal" because the cause and the effect are explicitly expressed in the sentence. The second sentence is labeled as "not causal", because there is no explicit cause expressed within the sentence.
  • Figure 4: To fine-tune a sequence classification model, a sequence is tokenized into subtokens and the [CLS] and [SEP] tokens are added. The embeddings are passed through the model's hidden layers. A classification head then determines whether the sequence is positive or negative by predicting logits from the sequence embedding held in the [CLS] token.
  • Figure 5: To fine-tune a span detection model, a sequence is tokenized and the [CLS] and [SEP] tokens are added. A BERT token classification model assigns a label to each subtoken by feeding its embedding into a classifier and selecting the label with the highest probability. Spans are then extracted using the IOB2 labeling scheme, where contiguous tokens labeled with 'B-' (begin) and 'I-' (inside) tags form identified spans.
  • ...and 7 more figures
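
The IOB2 span extraction described in the Figure 5 caption can be sketched as follows. This is a minimal illustration, not the paper's implementation: it works on whole words rather than subtokens, and the label names `CAUSE` and `EFFECT` and the function `extract_spans` are assumptions. The example sentence is the one from Figure 1.

```python
from typing import List, Tuple


def extract_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (label, text) spans from a sequence of IOB2 tags."""
    spans: List[Tuple[str, str]] = []
    label = None          # label of the span currently being built
    words: List[str] = []  # tokens collected for that span

    def close() -> None:
        # Store the open span, if any, and reset the state
        nonlocal label, words
        if label is not None:
            spans.append((label, " ".join(words)))
        label, words = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            close()                # a B- tag always starts a new span
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            words.append(token)    # continue the open span
        else:
            close()                # "O" or a stray I- tag ends the span
    close()
    return spans


tokens = ["The", "mayor's", "policies", "led", "to",
          "increased", "housing", "costs"]
tags = ["B-CAUSE", "I-CAUSE", "I-CAUSE", "O", "O",
        "B-EFFECT", "I-EFFECT", "I-EFFECT"]
pairs = extract_spans(tokens, tags)
# pairs == [("CAUSE", "The mayor's policies"),
#           ("EFFECT", "increased housing costs")]
```

An `I-` tag that does not continue an open span of the same label is dropped here. That is one possible convention; a real pipeline might instead treat it as the start of a new span.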