Sina at FigNews 2024: Multilingual Datasets Annotated with Bias and Propaganda

Lina Duaibes; Areej Jaber; Mustafa Jarrar; Ahmad Qadi; Mais Qandeel

TL;DR

The paper addresses bias and propaganda detection in multilingual social media by introducing a 12,000-post Facebook corpus annotated for bias and propaganda across five languages. It adopts a detailed taxonomy for bias (including explicit/implicit/vague types) and propaganda (with deletion flags) and evaluates annotation quality via Inter-Annotator Agreement (IAA), reporting an average of 80.8% for bias and 70.15% for propaganda. The study demonstrates strong human-annotation performance, describes a rigorous two-phase annotation process, and reveals language-specific patterns (notably French contributions to both bias and propaganda, with Hebrew showing more explicit bias). It concludes with implications for expanding the corpus and for applying neural/LLM approaches to automatic bias and propaganda detection in cross-language social-media data.

Abstract

The proliferation of bias and propaganda on social media is an increasingly significant concern, leading to the development of techniques for automatic detection. This article presents a multilingual corpus of 12,000 Facebook posts fully annotated for bias and propaganda. The corpus was created as part of the FigNews 2024 Shared Task on News Media Narratives for framing the Israeli War on Gaza. It covers various events during the War from October 7, 2023 to January 31, 2024. The corpus comprises 12,000 posts in five languages (Arabic, Hebrew, English, French, and Hindi), with 2,400 posts for each language. The annotation process involved 10 graduate students specializing in Law. The Inter-Annotator Agreement (IAA) was used to evaluate the annotations of the corpus, with an average IAA of 80.8% for bias and 70.15% for propaganda annotations. Our team was ranked among the best-performing teams in both the Bias and Propaganda subtasks. The corpus is open-source and available at https://sina.birzeit.edu/fada
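
The abstract reports an average IAA per subtask but does not say here which agreement measure was used. As a minimal sketch, assuming a pairwise setup, the snippet below shows two common choices: raw percent agreement and Cohen's kappa, each averaged over all annotator pairs. The function names, the toy labels, and the annotator keys are illustrative assumptions and are not taken from the corpus or the shared-task scorer.

```python
from collections import Counter
from itertools import combinations


def percent_agreement(a, b):
    """Share of items on which two annotators chose the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and equal in length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: probability both annotators pick the same label
    # if each labels independently according to their own label frequencies.
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:  # both annotators used one identical label throughout
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def mean_pairwise(annotations, metric):
    """Average a two-annotator metric over every pair of annotators."""
    pairs = list(combinations(annotations.values(), 2))
    return sum(metric(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy labels for five posts; NOT the corpus's real tag set.
toy = {
    "annotator_1": ["biased", "unbiased", "unbiased", "biased", "unclear"],
    "annotator_2": ["biased", "unbiased", "biased", "biased", "unclear"],
    "annotator_3": ["biased", "unbiased", "unbiased", "biased", "unbiased"],
}

print("mean percent agreement:", mean_pairwise(toy, percent_agreement))
print("mean Cohen's kappa:", mean_pairwise(toy, cohen_kappa))
```

Percent agreement is the easier figure to read, but it does not correct for chance, so the two measures can differ noticeably when one label dominates.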

Paper Structure (10 sections, 6 tables)

Introduction
Annotation Methodology
Annotation Guidelines
Inter-Annotator Agreement (IAA)
Team Composition and Training
Annotation process
Task Participation and Results
Results
Error Analysis and Discussion
Conclusion
