Sina at FigNews 2024: Multilingual Datasets Annotated with Bias and Propaganda

Lina Duaibes, Areej Jaber, Mustafa Jarrar, Ahmad Qadi, Mais Qandeel

TL;DR

The paper addresses bias and propaganda detection in multilingual social media by introducing a $12{,}000$-post Facebook corpus annotated for bias and propaganda across five languages. It adopts a detailed taxonomy for bias (including explicit/implicit/vague types) and propaganda (with deletion flags) and evaluates annotation quality via Inter-Annotator Agreement, reporting $0.808$ for bias and $0.7015$ for propaganda. The study demonstrates strong human-annotation performance, describes a rigorous two-phase annotation process, and reveals language-specific patterns (notably French contributions to both bias and propaganda, with Hebrew showing more explicit bias). It concludes with implications for expanding the corpus and for applying neural/LLM approaches to automatic bias and propaganda detection in cross-language social-media data.

Abstract

The proliferation of bias and propaganda on social media is an increasingly significant concern, leading to the development of techniques for automatic detection. This article presents a multilingual corpus of 12,000 Facebook posts fully annotated for bias and propaganda. The corpus was created as part of the FigNews 2024 Shared Task on News Media Narratives for framing the Israeli War on Gaza. It covers various events during the War from October 7, 2023 to January 31, 2024. The corpus comprises 12,000 posts in five languages (Arabic, Hebrew, English, French, and Hindi), with 2,400 posts for each language. The annotation process involved 10 graduate students specializing in Law. Inter-Annotator Agreement (IAA) was used to evaluate the annotations of the corpus, with an average IAA of 80.8% for bias and 70.15% for propaganda annotations. Our team was ranked among the best-performing teams in both the Bias and Propaganda subtasks. The corpus is open-source and available at https://sina.birzeit.edu/fada

Paper Structure (10 sections, 6 tables)