FMNV: A Dataset of Media-Published News Videos for Fake News Detection

Yihao Wang, Zhong Qian, Peifeng Li

TL;DR

The paper targets the detection of fake news in professionally produced news videos, a domain where misinformation can be particularly impactful. It introduces FMNV, a dataset of 2,393 media-sourced videos (1,500 fake, 893 real) augmented with 1,500 LLM-generated fakes across four manipulation types, enabling robust cross-modal analysis. The proposed FMNVD baseline fuses motion features (3D ResNeXt-101) and static visual semantics (CLIP), and uses hierarchical co-attention to integrate titles and audio transcripts, achieving state-of-the-art performance (0.7417 accuracy) on FMNV and demonstrating generalization across manipulation types. Overall, the work provides benchmarks and a methodological framework for cross-modal inconsistency analysis in high-impact fake news scenarios, and the FMNV dataset is publicly available to support future multimodal fake news detection (MFND) research.

Abstract

News media, particularly video-based platforms, have become deeply embedded in daily life, concurrently amplifying the risks of misinformation dissemination. Consequently, multimodal fake news detection has garnered significant research attention. However, existing datasets predominantly comprise user-generated videos characterized by crude editing and limited public engagement, whereas professionally crafted fake news videos disseminated by media outlets, often politically or virally motivated, pose substantially greater societal harm. To address this gap, we construct FMNV, a novel dataset exclusively composed of news videos published by media organizations. Through empirical analysis of existing datasets and our curated collection, we categorize fake news videos into four distinct types. Building upon this taxonomy, we employ Large Language Models (LLMs) to automatically generate deceptive content by manipulating authentic media-published news videos. Furthermore, we propose FMNVD, a baseline model featuring a dual-stream architecture that integrates spatio-temporal motion features from a 3D ResNeXt-101 backbone and static visual semantics from CLIP. The two streams are fused via an attention-based mechanism, while co-attention modules refine the visual, textual, and audio features for effective multi-modal aggregation. Comparative experiments demonstrate both the generalization capability of FMNV across multiple baselines and the superior detection efficacy of FMNVD. This work establishes critical benchmarks for detecting high-impact fake news in media ecosystems while advancing methodologies for cross-modal inconsistency analysis. Our dataset is available at https://github.com/DennisIW/FMNV.
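
As a rough illustration of what attention-based fusion of two visual streams can look like, the sketch below weights a motion embedding and a static embedding by scaled dot-product scores against a query vector and returns their weighted sum. This is not the authors' FMNVD implementation: the function names `softmax` and `attention_fuse`, the single `query` vector (standing in for learned parameters), and the assumption that both streams are already projected to the same dimension are all hypothetical choices made for this example.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(
    motion: Sequence[float],
    static: Sequence[float],
    query: Sequence[float],
) -> List[float]:
    """Fuse two same-sized stream embeddings into one vector.

    Hypothetical helper: `motion` stands in for a 3D ResNeXt-101 feature,
    `static` for a CLIP feature, and `query` for a learned attention query.
    """
    if not (len(motion) == len(static) == len(query)):
        raise ValueError("all vectors must share one dimension")
    scale = math.sqrt(len(query))
    # One scaled dot-product score per stream.
    scores = [
        sum(q * x for q, x in zip(query, stream)) / scale
        for stream in (motion, static)
    ]
    w_motion, w_static = softmax(scores)
    # Convex combination of the two streams, element by element.
    return [w_motion * m + w_static * s for m, s in zip(motion, static)]


if __name__ == "__main__":
    # A zero query scores both streams equally, so the result is their mean.
    print(attention_fuse([1.0, 3.0], [3.0, 1.0], [0.0, 0.0]))  # [2.0, 2.0]
```

In a trained model the query and the projections feeding each stream would be learned, and the fused vector would then pass to co-attention layers alongside the text and audio features; the sketch only shows how attention weights let one stream dominate when it matches the query more strongly.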

Paper Structure

This paper contains 22 sections, 9 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Example of Contextual Dishonesty: The video depicts children in Gaza longing for peace, while the title claims they anticipate continued conflict.
  • Figure 2: Example of Cherry-picked Editing: A fabricated claim that Musk is buying GM, stitched together from unrelated clips.
  • Figure 3: Example of Synthetic Voiceover: A party video overlaid with unrelated political commentary.
  • Figure 4: Example of Contrived Absurdity: The video shows a gigantic moon, while the title claims it will disappear in 30 seconds, which clearly defies common sense.
  • Figure 5: Overview of proposed FMNVD.