Language Models Learn Metadata: Political Stance Detection Case Study

Stanley Cao; Felix Drinkall

TL;DR

The paper addresses political stance detection in parliamentary debates and asks how metadata should be integrated when predicting a speaker's vote on a motion. It compares a metadata-only Naive Bayes baseline, a fine-tuned transformer (MPNet), two hybrid strategies (prepending metadata to the input text, and combining metadata-derived Bayesian probabilities with sentence-transformer representations), and GPT-4o prompting. Metadata-enhanced approaches outperform the prior state of the art on ParlVote+: prepending party and policy metadata to the input often provides the strongest gains, the simple party-only Bayesian baseline already surpasses the previous state of the art, and GPT-4o yields moderate gains. The results suggest that metadata can be a highly informative signal and that simpler, metadata-aware designs can surpass more complex architectures, with implications for metadata usage in NLP tasks beyond political discourse.
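As a rough illustration of the "prepend metadata to the input" strategy, the sketch below builds a single input string with metadata fields placed before the motion and the speech. The function name `prepend_metadata`, the `key: value` layout, the field names, and the ` [SEP] ` separator are all assumptions made for this example; the template actually used in the paper is not specified in this summary.

```python
from typing import Mapping


def prepend_metadata(
    speech: str,
    motion: str,
    metadata: Mapping[str, str],
    sep: str = " [SEP] ",
) -> str:
    """Return one input string with metadata placed before the text.

    Hypothetical layout (an assumption, not the paper's template):
    "<key>: <value> [SEP] ... [SEP] <motion> [SEP] <speech>"
    """
    # Render each metadata field as "key: value", preserving the
    # caller's field order so every example has the same layout.
    fields = [f"{key}: {value}" for key, value in metadata.items()]
    # Metadata first, then the motion, then the speaker's response.
    return sep.join(fields + [motion, speech])


# Synthetic example; the field names and texts are illustrative only.
example = prepend_metadata(
    speech="I oppose this bill.",
    motion="That the bill be read a second time.",
    metadata={
        "speaker_party": "Conservative",
        "motioner_party": "Labour",
        "policy": "Constitutionalism: Negative",
    },
)
print(example)
```

The resulting string would then be tokenized and passed to the fine-tuned model like any other text input, so the model can attend to the metadata tokens directly instead of receiving them through a separate feature pathway.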

Abstract

Stance detection is a crucial NLP task with numerous applications in social science, from analyzing online discussions to assessing political campaigns. This paper investigates the optimal way to incorporate metadata into a political stance detection task. We demonstrate that previous methods combining metadata with language-based data for political stance detection have not fully utilized the metadata information; our simple baseline, using only party membership information, surpasses the current state-of-the-art. We then show that prepending metadata (e.g., party and policy) to political speeches performs best, outperforming all baselines, indicating that complex metadata inclusion systems may not learn the task optimally.
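To make the idea of a party-only baseline concrete, here is a minimal sketch that estimates the probability of a supporting vote from counts over (motioner party, speaker party) pairs and predicts by thresholding at 0.5. The class name `PartyPairPrior`, the add-one smoothing parameter `alpha`, and the toy rows are assumptions introduced for illustration; the paper's own Bayesian model may differ in its features and estimation details.

```python
from collections import defaultdict


class PartyPairPrior:
    """Smoothed estimate of P(vote = 1 | motioner party, speaker party).

    A hypothetical stand-in for a metadata-only baseline.
    """

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha  # additive (Laplace) smoothing strength
        # Maps (motioner_party, speaker_party) -> [n_against, n_for].
        self.counts = defaultdict(lambda: [0, 0])

    def fit(self, rows):
        # Each row is (motioner_party, speaker_party, vote), vote in {0, 1}.
        for motioner_party, speaker_party, vote in rows:
            self.counts[(motioner_party, speaker_party)][vote] += 1
        return self

    def prob_for(self, motioner_party: str, speaker_party: str) -> float:
        # Unseen pairs fall back to zero counts, which gives 0.5.
        n_against, n_for = self.counts.get(
            (motioner_party, speaker_party), (0, 0)
        )
        return (n_for + self.alpha) / (n_against + n_for + 2 * self.alpha)

    def predict(self, motioner_party: str, speaker_party: str) -> int:
        return int(self.prob_for(motioner_party, speaker_party) >= 0.5)


# Synthetic toy data, not drawn from ParlVote+.
rows = [
    ("Labour", "Labour", 1),
    ("Labour", "Labour", 1),
    ("Labour", "Conservative", 0),
    ("Labour", "Conservative", 0),
    ("Labour", "Conservative", 1),
]
model = PartyPairPrior().fit(rows)
print(model.prob_for("Labour", "Labour"))        # (2 + 1) / (2 + 2)
print(model.prob_for("Labour", "Conservative"))  # (1 + 1) / (3 + 2)
```

A baseline of this kind ignores the speech text entirely, which is what makes it a useful reference point: any text-based model should be expected to beat it before its handling of metadata is considered adequate.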

Paper Structure (28 sections, 2 equations, 5 figures, 2 tables)

Introduction
Related Work
Methodology
Data
Bayesian Models
Finetuning MPNet
Hybrid Models
Combining Bayesian Probabilities with Sentence Transformers
Prepending Metadata to Textual Data
GPT-4o
Results
Analysis
Conclusion
Motion and Speech Truncation
Metadata Incorporation into Transformers
...and 13 more sections

Figures (5)

Figure 1: Attention weights, averaged across all heads, for a speech by a Conservative party member responding to a Labour party member. Speech: "I am sure that my hon friend the member for aldershot was about to cite an example that even the minister will remember : the war crimes bill."
Figure 2: Model Accuracy by Speaker Response Length / Word Count (with only party information)
Figure 3: Same speech as Figure 1. The policy in question is "Constitutionalism: Negative."
Figure 4: Model Accuracy by Speaker Response Length / Word Count (with party and policy information)
Figure 5: Model Accuracy by Bayesian Prior Uncertainty $\left(|p - 0.5|\right)$ for different motioner party + speaker party pairs. To limit the effect of outliers, we include only datapoints where the number of examples exceeds 50. The curve of best fit is computed with Locally Weighted Scatterplot Smoothing (LOWESS).
