Experimenting AI Technologies for Disinformation Combat: the IDMO Project

Lorenzo Canale; Alberto Messina

TL;DR

The paper addresses disinformation detection within the IDMO framework by building multilingual datasets and evaluating automated verdict categorization and textual entailment. It introduces novel Italian datasets (RAI-CMM-OPEN series and Pagella Politica) alongside FEVER-based multilingual variants, and demonstrates strong performance for DistilBERT in RTE tasks with cross-language generalization. It also examines retrieval-based debunking (Statement-Document Similarity and Content Treatment Detection) and explores the role of GPT-4 in content clarity and treatment analysis, complemented by a serious game (FAKE RADAR) to raise public awareness. Collectively, the work advances automated disinformation analysis, data annotation workflows, and public-facing tools, with clear directions for future integration of LangChain-based retrieval, fine-tuning of GPT models, and ongoing game-based engagement.

Abstract

The Italian Digital Media Observatory (IDMO) project, part of a European initiative, focuses on countering disinformation and fake news. This report outlines the contributions of Rai-CRITS to the project: (i) the creation of novel datasets for testing technologies; (ii) the development of an automatic model for categorizing Pagella Politica verdicts to facilitate broader analysis; (iii) the creation of an automatic model for recognizing textual entailment with exceptional accuracy on the FEVER dataset; (iv) an assessment of GPT-4 for detecting content treatment style; and (v) a game to raise awareness about fake news at national events.

Paper Structure (20 sections, 5 figures, 10 tables)

Introduction
Dataset Creation and Annotation
Manually Annotated Datasets
Datasets from RAI-CMM and OPEN
Datasets from Pagella Politica
Additional Datasets
Impact of Disinformation Spread through Media
Twitter Impact Analysis
Politician Statements and Media Influence
Usage of Pagella Politica verdicts for further analysis
Automatically categorize Pagella Politica fact-checking verdicts
Document retrieval for debunking
Statement-Document Similarity Analysis
Recognizing Textual Entailment
DistilBERT fine-tuning on FEVER dataset
...and 5 more sections

Figures (5)

Figure 1: Screenshot depicting the annotation of the PagellaPolitica2 dataset conducted through a Telegram bot for ease of use
Figure 2: Distribution of categories by information source
Figure 3: Distribution of categories by party
Figure 4: Accuracy vs. Number of Training Samples
Figure 5: Confusion Matrices between true labels and labels predicted by GPT-4 for Content Treatment Detection on the RAI-CMM-OPEN2 dataset
