VLDBench: Evaluating Multimodal Disinformation with Regulatory Alignment

Shaina Raza, Ashmal Vayani, Aditya Jain, Aravind Narayanan, Vahid Reza Khazaie, Syed Raza Bashir, Elham Dolatabadi, Gias Uddin, Christos Emmanouilidis, Rizwan Qureshi, Mubarak Shah

TL;DR

VLDBench introduces the first governance-aligned, large-scale multimodal disinformation benchmark that jointly evaluates text-only and text–image detection. It combines a semi-automated annotation pipeline with expert verification to produce 62,678 labeled instances across 13 categories from 58 outlets, and provides open data, evaluation code, and a risk-score framework tied to AI governance. Empirical results show that vision–language models generally outperform unimodal baselines, and that instruction fine-tuning and larger model scale further improve performance, although the models remain vulnerable to cross-modal perturbations. The benchmark also integrates governance considerations, mapping to the MIT AI Risk Repository and offering a risk scorecard to support responsible AI deployment and policy analysis. Overall, VLDBench delivers a reproducible resource for advancing trustworthy multimodal disinformation detection and governance-aligned evaluation.

Abstract

Detecting disinformation that blends manipulated text and images has become increasingly challenging, as AI tools make synthetic content easy to generate and disseminate. While most existing AI safety benchmarks focus on single-modality misinformation (i.e., false content shared without intent to deceive), intentional multimodal disinformation, such as propaganda or conspiracy theories that imitate credible news, remains largely unaddressed. We introduce the Vision-Language Disinformation Detection Benchmark (VLDBench), the first large-scale resource supporting both unimodal (text-only) and multimodal (text + image) disinformation detection. VLDBench comprises approximately 62,000 labeled text-image pairs across 13 categories, curated from 58 news outlets. Using a semi-automated pipeline followed by expert review, 22 domain experts invested over 500 hours to produce high-quality annotations with substantial inter-annotator agreement. Evaluations of state-of-the-art Large Language Models (LLMs) and Vision-Language Models (VLMs) on VLDBench show that incorporating visual cues improves detection accuracy by 5 to 35 percentage points over text-only models. VLDBench provides data and code for evaluation, fine-tuning, and robustness testing to support disinformation analysis. Developed in alignment with AI governance frameworks (e.g., the MIT AI Risk Repository), VLDBench offers a principled foundation for advancing trustworthy disinformation detection in multimodal media.

Project: https://vectorinstitute.github.io/VLDBench/
Dataset: https://huggingface.co/datasets/vector-institute/VLDBench
Code: https://github.com/VectorInstitute/VLDBench

Paper Structure

This paper contains 50 sections, 19 figures, 24 tables.

Figures (19)

  • Figure 1: Schematic illustration of the disinformation flow: fabricated narratives and multimodal manipulation distort real-world information, producing myths and deliberate falsehoods and ultimately eroding public trust.
  • Figure 2: Disinformation example. This instance spreads likely disinformation by generating political hype without factual basis.
  • Figure 3: Representative example: left = image; right = concise rule and evidence.
  • Figure 4: VLDBench Framework: The system comprises five stages: (1) Define Task: formalizing the detection objective; (2) Data Pipeline: curating and preprocessing real-world multimodal news content; (3) Annotation Pipeline: generating labels via LLM-assisted annotation for scale; (4) Human Review: validating annotations through expert oversight; and (5) Benchmarking: evaluating models for accuracy, reasoning, and risk mitigation across prompting, fine-tuning, and robustness scenarios.
  • Figure 5: Annotation pipeline used in VLDBench.
  • ...and 14 more figures