Experimenting AI Technologies for Disinformation Combat: the IDMO Project

Lorenzo Canale, Alberto Messina

TL;DR

The paper addresses disinformation detection within the IDMO framework by building multilingual datasets and evaluating automated verdict categorization and textual entailment. It introduces novel Italian datasets (the RAI-CMM-OPEN series and Pagella Politica) alongside FEVER-based multilingual variants, and reports strong DistilBERT performance on recognizing textual entailment (RTE), with cross-language generalization. It also examines retrieval-based debunking (Statement-Document Similarity and Content Treatment Detection) and explores the role of GPT-4 in content clarity and treatment analysis, complemented by a serious game (FAKE RADAR) to raise public awareness. Collectively, the work advances automated disinformation analysis, data annotation workflows, and public-facing tools, and it identifies directions for future work: integration of LangChain-based retrieval, fine-tuning of GPT models, and ongoing game-based engagement.

Abstract

The Italian Digital Media Observatory (IDMO) project, part of a European initiative, focuses on countering disinformation and fake news. This report outlines the contributions of Rai-CRITS to the project, including: (i) the creation of novel datasets for testing technologies; (ii) the development of an automatic model for categorizing Pagella Politica verdicts to facilitate broader analysis; (iii) the creation of an automatic model for recognizing textual entailment with exceptional accuracy on the FEVER dataset; (iv) an assessment of GPT-4 for detecting content treatment style; and (v) a game to raise awareness about fake news at national events.

Paper Structure (20 sections, 5 figures, 10 tables)

Figures (5)

  • Figure 1: Screenshot depicting the annotation of the PagellaPolitica2 dataset conducted through a Telegram bot for ease of use
  • Figure 2: Distribution of categories by information source
  • Figure 3: Distribution of categories by party
  • Figure 4: Accuracy vs. Number of Training Samples
  • Figure 5: Confusion Matrices between true labels and labels predicted by GPT-4 for Content Treatment Detection on the RAI-CMM-OPEN2 dataset