FMNV: A Dataset of Media-Published News Videos for Fake News Detection

Yihao Wang; Zhong Qian; Peifeng Li

TL;DR

The paper addresses fake news detection in professionally produced news videos, a domain where misinformation can be particularly harmful. It introduces FMNV, a dataset of 2,393 media-sourced videos (1,500 fake, 893 real) augmented with 1,500 LLM-generated fakes across four manipulation types, enabling cross-modal analysis. The proposed FMNVD baseline fuses motion features (3D ResNeXt-101) with static visual semantics (CLIP) and uses hierarchical co-attention to integrate the title and audio transcript, achieving state-of-the-art performance (0.7417 accuracy) on FMNV and generalizing across manipulation types. Overall, the work provides benchmarks and a methodological framework for cross-modal inconsistency analysis in high-impact fake news scenarios, and the FMNV dataset is publicly available to support future multimodal fake news detection (MFND) research.

Abstract

News media, particularly video-based platforms, have become deeply embedded in daily life, concurrently amplifying the risks of misinformation dissemination. Consequently, multimodal fake news detection has garnered significant research attention. However, existing datasets predominantly comprise user-generated videos characterized by crude editing and limited public engagement, whereas professionally crafted fake news videos disseminated by media outlets, often politically or virally motivated, pose substantially greater societal harm. To address this gap, we construct FMNV, a novel dataset exclusively composed of news videos published by media organizations. Through empirical analysis of existing datasets and our curated collection, we categorize fake news videos into four distinct types. Building upon this taxonomy, we employ Large Language Models (LLMs) to automatically generate deceptive content by manipulating authentic media-published news videos. Furthermore, we propose FMNVD, a baseline model featuring a dual-stream architecture that integrates spatio-temporal motion features from a 3D ResNeXt-101 backbone and static visual semantics from CLIP. The two streams are fused via an attention-based mechanism, while co-attention modules refine the visual, textual, and audio features for effective multimodal aggregation. Comparative experiments demonstrate both the generalization capability of FMNV across multiple baselines and the superior detection efficacy of FMNVD. This work establishes critical benchmarks for detecting high-impact fake news in media ecosystems while advancing methodologies for cross-modal inconsistency analysis. Our dataset is available at https://github.com/DennisIW/FMNV.
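
The abstract does not spell out the fusion mechanism, so the following is only a minimal sketch of one common way to fuse two feature streams with attention, not the authors' implementation. It assumes the motion and static features have already been projected to vectors of the same length; the function name attention_fuse, the query vector, and the toy inputs are all hypothetical stand-ins for learned parameters and for real 3D ResNeXt-101 and CLIP outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(motion, static, query):
    """Fuse two same-length feature vectors with softmax attention weights.

    Hypothetical illustration: `query` stands in for a learned vector that
    scores each stream; the fused vector is the weighted sum of the streams.
    """
    if not (len(motion) == len(static) == len(query)):
        raise ValueError("all vectors must share one dimension")
    scale = math.sqrt(len(query))
    # One scaled dot-product score per stream.
    scores = [
        sum(q * x for q, x in zip(query, stream)) / scale
        for stream in (motion, static)
    ]
    weights = softmax(scores)
    fused = [weights[0] * m + weights[1] * s for m, s in zip(motion, static)]
    return fused, weights


# Toy vectors, not real model outputs.
motion_features = [1.0, 0.0, 0.5, 0.0]
static_features = [0.0, 1.0, 0.0, 0.5]
learned_query = [2.0, 0.0, 0.0, 0.0]
fused, weights = attention_fuse(motion_features, static_features, learned_query)
print(weights, fused)
```

With this toy query the motion stream receives the larger weight, because only the motion vector has a nonzero first component. In a trained model the query (or a small scoring network in its place) would be learned, and the same weighting idea extends to sequences of frame-level features.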
