Uncovering Agendas: A Novel French & English Dataset for Agenda Detection on Social Media

Gregorios Katsios; Ning Sa; Ankita Bhaumik; Tomek Strzalkowski

TL;DR

This work tackles agenda detection on social media with limited labeled data by reframing the task as textual entailment. It pre-trains models on standard NLI datasets and fine-tunes them on a bilingual agenda dataset derived from tweets about the 2022 French elections, generating hypotheses from agenda labels and interpreting entailment signals as multi-label outputs. The study shows that textual-entailment approaches, especially multilingual T5-based models with RTE pre-training, outperform traditional classification and zero-shot baselines in a low-resource, multilingual setting. The dataset and code are released to enable further research, supporting rapid detection of emergent influence campaigns across languages and media.

Abstract

The behavior and decision-making of groups or communities can be dramatically influenced by individuals pushing particular agendas, e.g., to promote or disparage a person or an activity, to call for action, etc. In the examination of online influence campaigns, particularly those related to important political and social events, scholars often concentrate on identifying the sources responsible for setting and controlling the agenda (e.g., public media). In this article we present a methodology for detecting specific instances of agenda control through social media where annotated data is limited or non-existent. Using a modest corpus of Twitter messages centered on the 2022 French Presidential Elections, we carry out a comprehensive evaluation of various approaches and techniques that can be applied to this problem. Our findings demonstrate that by treating the task as a textual entailment problem, it is possible to overcome the requirement for a large annotated training dataset.
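
The entailment framing described above can be illustrated with a minimal sketch: each agenda label is turned into a hypothesis sentence, the message serves as the premise, and every label whose hypothesis is judged to be entailed is returned, which gives a multi-label prediction. The label names, the hypothesis template, the cue words, the 0.5 threshold, and the `toy_entailment_score` function are all assumptions made for illustration. They are not taken from the paper, and the toy scorer merely stands in for a fine-tuned NLI model.

```python
from typing import Callable, Dict, List

# Hypothetical agenda labels and hypothesis template (illustration only).
AGENDA_LABELS: List[str] = ["promote", "disparage", "call for action"]
TEMPLATE = "The author of this message intends to {label}."

# Hypothetical cue phrases used by the toy scorer below.
CUES: Dict[str, tuple] = {
    "promote": ("vote for", "support"),
    "disparage": ("liar", "disgrace"),
    "call for action": ("join us", "sign", "march"),
}


def build_hypotheses(labels: List[str], template: str = TEMPLATE) -> Dict[str, str]:
    """Map each agenda label to a natural-language hypothesis."""
    return {label: template.format(label=label) for label in labels}


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Stand-in for an NLI model: returns 1.0 if the premise contains a cue
    phrase for the label named in the hypothesis, otherwise 0.0."""
    text = premise.lower()
    for label, cues in CUES.items():
        if label in hypothesis and any(cue in text for cue in cues):
            return 1.0
    return 0.0


def detect_agendas(
    premise: str,
    labels: List[str],
    scorer: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Return every label whose hypothesis scores at or above the threshold."""
    hypotheses = build_hypotheses(labels)
    return [
        label
        for label, hypothesis in hypotheses.items()
        if scorer(premise, hypothesis) >= threshold
    ]


if __name__ == "__main__":
    message = "Join us on Sunday and vote for change!"
    print(detect_agendas(message, AGENDA_LABELS, toy_entailment_score))
```

Because the scorer is passed in as a function, the keyword stub could be swapped for a call to a real entailment model without changing the hypothesis generation or the thresholding step. A message that triggers no hypothesis yields an empty list, which corresponds to "no agenda detected" in this sketch.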

Paper Structure (30 sections, 2 figures, 8 tables)

Introduction
Related Work
Agenda Detection
Textual Entailment Text Classification
Data
Pre-training Data
Fine-tuning Data
Method
Models
Pre-training
Fine-tuning
Generating Hypotheses from Agenda Labels
Interpreting Agenda Predictions from Textual Entailment
Experimental Set-up & Results
Zero-shot Agenda Detection
...and 15 more sections

Figures (2)

Figure 1: Confusion matrix of agenda-rte-bi-mT5 French results. The extra labels are in the bottom row and the missed labels are in the rightmost column.
Figure 2: Confusion matrix for agenda-rte-bi-mT5 French results on Run 1. The extra labels are in the bottom row and the missed labels are in the rightmost column.
