Mapping the political landscape from data traces: multidimensional opinions of users, politicians and media outlets on X

Antoine Vendeville, Jimena Royo-Letelier, Duncan Cassells, Jean-Philippe Cointet, Maxime Crépel, Tim Faverjon, Théophile Lenoir, Béatrice Mazoyer, Benjamin Ooghe-Tabanou, Armin Pournaki, Hiroki Yamashita, Pedro Ramaciotti

TL;DR

This work creates the first public, multidimensional, continuous opinion dataset for $N=978{,}933$ X users, 883 French MPs, and hundreds of media domains, calibrated across 16 dimensions of the Chapel Hill Expert Survey (CHES). It derives latent positions from the MP–follower follow network via Correspondence Analysis and maps them onto CHES axes using an affine transformation with Ridge regularization, yielding interpretable ideological and issue coordinates $\phi \in \mathbb{R}^P$, with $P$ up to 883. Validation uses LLM and human annotations of user bios to show that labeled users concentrate monotonically and separate clearly along the corresponding CHES dimensions, and it checks consistency between the CHES 2019 and 2023 waves. The dataset enables nuanced analyses of polarization, media influence, and political space dynamics beyond the traditional Left-Right scale, and it supports cross-country and cross-platform extensions while acknowledging platform-specific biases and sampling limitations. Overall, the resource provides a scalable framework for studying multidimensional online political ecosystems, with practical implications for understanding online debates and media effects.
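The calibration step mentioned above, an affine map fitted with Ridge regularization, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy data, the function names `fit_affine_ridge` and `apply_affine`, the penalty `lam`, and the choice to leave the intercept unpenalized are all hypothetical. It also assumes the map is fitted on a small set of reference points with known target positions and then applied to every other point in the latent space.

```python
import numpy as np

def fit_affine_ridge(Z, Y, lam=1.0):
    """Fit Y ~ Z @ A + b, penalising the squared norm of A but not b."""
    z_mean, y_mean = Z.mean(axis=0), Y.mean(axis=0)
    Zc, Yc = Z - z_mean, Y - y_mean
    # Closed-form ridge solution on centred data.
    A = np.linalg.solve(Zc.T @ Zc + lam * np.eye(Z.shape[1]), Zc.T @ Yc)
    # The intercept follows from the means once A is known.
    b = y_mean - z_mean @ A
    return A, b

def apply_affine(Z, A, b):
    """Map latent positions Z onto the calibrated axes."""
    return Z @ A + b

rng = np.random.default_rng(0)

# Hypothetical toy data: 8 reference points in a 2-D latent space,
# with known positions on 3 target dimensions.
Z_ref = rng.normal(size=(8, 2))
A_true = np.array([[1.5, -0.5, 0.0],
                   [0.2, 2.0, -1.0]])
b_true = np.array([5.0, 5.0, 5.0])
Y_ref = Z_ref @ A_true + b_true

# A tiny penalty recovers the generating map on noiseless data.
A, b = fit_affine_ridge(Z_ref, Y_ref, lam=1e-8)

# Apply the fitted map to 1000 hypothetical user positions.
Z_users = rng.normal(size=(1000, 2))
Y_users = apply_affine(Z_users, A, b)
print(Y_users.shape)
```

Raising `lam` shrinks the entries of `A` toward zero, which trades some fit on the reference points for stability when those points are few relative to the number of latent dimensions.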

Abstract

Studying political activity on social media often requires defining and measuring political stances of users or content. Relevant examples include the study of opinion polarization, or the study of political diversity in online content diets. While many research designs rely on operationalizations best suited for the US setting, few allow addressing more general political systems, in which users and media outlets might exhibit stances on multiple ideology and issue dimensions, going beyond traditional Liberal-Conservative or Left-Right scales. To advance the study of more general online ecosystems, we present a dataset pertaining to a population of X/Twitter users, parliamentarians, and media outlets embedded in a political space spanned by dimensions measuring attitudes towards immigration, the EU, liberal values, elites and institutions, nationalism and the environment, in addition to left-right and liberal-conservative scales. We include indicators of individual activity and popularity: mean number of posts per day, number of followers, and number of followees. We provide several benchmarks validating the positions of these entities and discuss several applications for this dataset.

Paper Structure (50 sections, 1 equation, 5 figures, 10 tables)

Figures (5)

  • Figure 1: Positioning of users, MPs, parties and three selected media domains on four political axes: economic Left-Right and anti-elite sentiment (left), European integration and ecology (right). Colored crosses represent MPs. Colored rectangles indicate the average position of each party's parliamentarians. Marginal densities and hexagons indicate the distribution of followers' positions. White dots indicate the positions of the three selected media domains.
  • Figure 2: Proportion of users having a given label indicating ideological or issue stances, as provided by both human and LLM annotations, along the corresponding CHES dimensions the labels are intended to validate. Vertical bars delimit Clopper-Pearson confidence intervals (Clopper and Pearson, 1934) at the $\alpha=0.05$ level.
  • Figure 3: Illustration of the logistic regression model for the economic Left-Right dimension (CHES 2019) and the anti-elite sentiment dimension (CHES 2023). We show results obtained with human annotations as well as with LLM annotations. Blue and red areas indicate the distributions of the political positions of the annotated users. The cutoff marks where the model changes its prediction. We also indicate ROC AUC and F1 scores.
  • Figure 4: Positions of followers, MPs and party centroids along the four dimensions present in the 2019 and 2023 waves of CHES surveys. Blue dots are followers, yellow dots are MPs, colored squares represent party positions in the CHES surveys. We show the $y=x$ line in black. We also indicate Pearson correlations between the positions of users on the two axes.
  • Figure 5: Validation of the media positions along two political axes. We show the distribution of positions within each media category. The distributions show a strong alignment between the left-right categorization of media and both the economic Left-Right dimension and the Environment dimension. The latter is consistent with the difference in environmental discourse between the left and the right in France.
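
The Clopper-Pearson intervals mentioned in the caption of Figure 2 can be computed as in the sketch below. It is a minimal standard-library illustration, not the authors' code: the function names and the example counts (37 labeled users out of 100 in a bin) are hypothetical. Each bound is found by bisection on the binomial tail probability, which is one of several equivalent ways to obtain the exact interval.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); evaluates to 0 when k < 0."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def solve_increasing(f, target, iters=60):
    """Bisection on [0, 1] for f(p) = target, with f non-decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k / n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2 (0 when k == 0).
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2 (1 when k == n).
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper

# Hypothetical bin: 37 of 100 annotated users carry the label.
low, high = clopper_pearson(37, 100)
print(f"proportion = 0.37, 95% CI = [{low:.3f}, {high:.3f}]")
```

The interval is exact in the sense that its coverage is guaranteed to be at least $1-\alpha$, which makes it conservative for bins containing few annotated users.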