A Detailed Factor Analysis for the Political Compass Test: Navigating Ideologies of Large Language Models
Sadia Kamal, Lalu Prasad Yadav Prakash, S M Rafiuddin, Mohammed Rakib, Atriya Sen, Sagnik Ray Choudhury
TL;DR
This paper examines whether the Political Compass Test (PCT) and the 8 Values test reliably measure political leanings in large language models, given that scores are unstable and sensitive to prompting. It screens four open-source LLMs and runs controlled experiments on decoding parameters, prompting, and fine-tuning, analyzing $2693$ PCT results with ANOVA and t-tests; each PCT axis ranges from $-10$ to $+10$. The main findings are that prompting has strong effects, standard decoding parameters have limited impact, and fine-tuning can shift scores in ways largely independent of the political content of the tuning data; these results generalize to 8 Values and hold across model sizes and quantization. The work raises validity concerns about current bias benchmarks and motivates robust, mechanism-aware measures of how political content is encoded in LLMs.
Abstract
The Political Compass Test (PCT) and similar surveys are commonly used to assess political bias in auto-regressive LLMs. Our rigorous statistical experiments show that while changes to standard generation parameters have minimal effect on PCT scores, prompt phrasing and fine-tuning, individually and together, can significantly influence results. Interestingly, fine-tuning on politically rich vs. politically neutral datasets does not lead to different shifts in scores. We also generalize these findings to a similar popular test called 8 Values. Humans do not change their responses to questions when prompted differently (``answer this question'' vs. ``state your opinion''), or after exposure to politically neutral text such as mathematical formulae. That the models do so raises concerns about the validity of these tests for measuring model bias, and paves the way for deeper exploration into how political and social views are encoded in LLMs.
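As a rough illustration of the kind of comparison the TL;DR refers to (ANOVA over PCT scores grouped by an experimental factor), the following minimal sketch computes a one-way ANOVA F statistic in plain Python. It is not the authors' analysis code. The function name `one_way_anova_f`, the condition label `choose_an_option`, and all score values are hypothetical and invented purely for illustration; they are not results from the paper.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per condition.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores on one PCT axis (range -10 to +10), grouped by
# prompt phrasing. These numbers are made up, not taken from the paper.
scores_by_prompt = {
    "answer_this_question": [-1.0, -0.5, -1.5, -1.0],
    "state_your_opinion": [-3.0, -2.5, -3.5, -3.0],
    "choose_an_option": [0.5, 1.0, 0.0, 0.5],  # hypothetical third phrasing
}

f_stat, df_b, df_w = one_way_anova_f(list(scores_by_prompt.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the F distribution with the given degrees of freedom would indicate that mean scores differ across prompt phrasings. In practice a statistics library would also supply the p-value; this sketch stops at the test statistic.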
