Official-NV: An LLM-Generated News Video Dataset for Multimodal Fake News Detection

Yihao Wang, Lizhi Chen, Zhong Qian, Peifeng Li

TL;DR

Official-NV addresses noise in multimodal fake news detection by curating officially published news videos and augmenting them with LLM-generated text and frame alterations to produce a balanced, 10,000-video dataset. The authors propose OFNVD, a baseline that uses GLU attention to extract salient text and frame features and a cross-modal Transformer to fuse modalities for detection. Through extensive experiments, they demonstrate OFNVD's strong performance, the value of title information, and the robustness of Official-NV across data augmentation strategies and imbalanced conditions. This work presents a practical, high-quality resource and a competitive baseline to advance multimodal fake news detection research in video domains.

Abstract

News media, especially video news media, have penetrated every aspect of daily life, which also brings the risk of fake news. As a result, multimodal fake news detection has recently attracted increasing attention. However, existing datasets consist of user-uploaded videos and contain an excessive amount of superfluous data, which introduces noise into model training. To address this issue, we construct a dataset named Official-NV, comprising officially published news videos. The crawled official videos are augmented through LLM-based generation and manual verification, thereby expanding the dataset. We also propose a new baseline model called OFNVD, which captures key information from multimodal features through a GLU attention mechanism and performs feature enhancement and modal aggregation via a cross-modal Transformer. Benchmarking the dataset and baselines demonstrates the effectiveness of our model in multimodal fake news detection.
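
The abstract names a GLU attention mechanism but does not define it here. As a reminder of the underlying idea only, the following is a minimal sketch of a standard gated linear unit, in which a sigmoid gate rescales each component of a linear projection. The function names, shapes, and weights are illustrative assumptions, and this is not presented as the OFNVD implementation.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(x: Vector, weight: Matrix, bias: Vector) -> Vector:
    """Compute weight @ x + bias for a single feature vector."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weight, bias)
    ]


def glu(x: Vector,
        w_value: Matrix, b_value: Vector,
        w_gate: Matrix, b_gate: Vector) -> Vector:
    """Standard gated linear unit: (W x + b) * sigmoid(V x + c).

    Hypothetical helper for illustration; how the paper applies the
    gate to text and frame features is not stated in the abstract.
    """
    value = linear(x, w_value, b_value)
    gate = [sigmoid(g) for g in linear(x, w_gate, b_gate)]
    # A gate near 1 passes the feature through; a gate near 0 suppresses it.
    return [v * g for v, g in zip(value, gate)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # With an all-zero gate projection, every gate equals sigmoid(0) = 0.5.
    print(glu([1.0, 2.0], identity, [0.0, 0.0], zeros, [0.0, 0.0]))
```

In a trained model, the gate weights would be learned, so that features judged uninformative receive gates close to zero.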

Paper Structure

This paper contains 17 sections, 8 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Two examples of fake news videos. Top: the video shows crowds in Gaza eager for peace, but the title reads "Anticipating Continuation of Hostilities". Bottom: the title reads "Japan quake", but the video is interspersed with footage from the Syria quake.
  • Figure 2: Distributions of news videos
  • Figure 3: Overview of proposed OFNVD
  • Figure 4: A case from Official-NV demonstrating the role of GLU attention