Validating Political Position Predictions of Arguments

Jordan Robinson; Angus R. Williams; Katie Atkinson; Anthony G. Cohn

Abstract

Real-world knowledge representation often requires capturing subjective, continuous attributes -- such as political positions -- that conflict with pairwise validation, the widely accepted gold standard for human evaluation. We address this challenge through a dual-scale validation framework applied to political stance prediction in argumentative discourse, combining pointwise and pairwise human annotation. Using 22 language models, we construct a large-scale knowledge base of political position predictions for 23,228 arguments drawn from 30 debates that appeared on the UK political television programme \textit{Question Time}. Pointwise evaluation shows moderate human-model agreement (Krippendorff's $\alpha=0.578$), reflecting intrinsic subjectivity, while pairwise validation reveals substantially stronger alignment between human- and model-derived rankings ($\alpha=0.86$ for the best model). This work contributes: (i) a practical validation methodology for subjective continuous knowledge that balances scalability with reliability; (ii) a validated structured argumentation knowledge base enabling graph-based reasoning and retrieval-augmented generation in political domains; and (iii) evidence that ordinal structure can be extracted from pointwise language model predictions on inherently subjective real-world discourse, advancing knowledge representation capabilities for domains where traditional symbolic or categorical approaches are insufficient.

Paper Structure (42 sections, 5 equations, 6 figures, 6 tables)

Introduction
Related Work
LLMs as Evaluators of Language Outputs.
Political Position Prediction with LLMs.
Pairwise Comparison and Preference-Based Evaluation.
Political Argumentation Resources.
Knowledge Base Construction
Data Sources and Preprocessing
LLMs as Judges of Political Position
Ensemble Construction and Aggregation
Ensemble 1 [$E_1$]: All ($n=22$).
Ensemble 2 [$E_2$]: Reasoning Models ($n=5$).
Ensemble 3 [$E_3$]: High-Confidence Models ($n=12$).
Human Annotation and Validation Design
Pointwise Binary Classification of Political Sentiment
...and 27 more sections

Figures (6)

Figure 1: Overview of the methodology used to instantiate a structured argumentative knowledge base containing political position predictions for all arguments. Green boxes indicate components developed specifically for this study.
Figure 2: Nominal Krippendorff's $\alpha_n$ measuring inter-model agreement across data partitions.
Figure 3: Distribution of human--model agreement across dataset partitions.
Figure 4: Relationship between human--model agreement and model performance metrics.
Figure 5: Macro F1 plotted against ordinal human--model agreement for all models and ensembles under $\mathcal{P}$.
...and 1 more figure
