Official-NV: An LLM-Generated News Video Dataset for Multimodal Fake News Detection
Yihao Wang, Lizhi Chen, Zhong Qian, Peifeng Li
TL;DR
Official-NV addresses noise in multimodal fake news detection by curating officially published news videos and augmenting them with LLM-generated text and frame alterations to produce a balanced, 10,000-video dataset. The authors propose OFNVD, a baseline that uses GLU attention to extract salient text and frame features and a cross-modal Transformer to fuse modalities for detection. Through extensive experiments, they demonstrate OFNVD's strong performance, the value of title information, and the robustness of Official-NV across data augmentation strategies and imbalanced conditions. This work presents a practical, high-quality resource and a competitive baseline to advance multimodal fake news detection research in video domains.
Abstract
News media, especially video news media, have penetrated every aspect of daily life, which also brings the risk of fake news. Multimodal fake news detection has therefore attracted increasing attention. However, existing datasets consist of user-uploaded videos and contain an excess of superfluous data, which introduces noise into model training. To address this issue, we construct a dataset named Official-NV, comprising officially published news videos. The crawled videos are augmented through LLM-based generation and manual verification, thereby expanding the dataset. We also propose a new baseline model called OFNVD, which captures key information from multimodal features through a GLU attention mechanism and performs feature enhancement and modal aggregation via a cross-modal Transformer. Benchmarking the dataset and baselines demonstrates the effectiveness of our model in multimodal fake news detection.
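To make the two-stage idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "GLU attention" can be approximated by a standard gated linear unit applied to each modality and that the "cross-modal Transformer" can be reduced to one single-head cross-attention step in each direction. All names (`glu_gate`, `cross_attention`), dimensions, random weights, and the mean-pool-plus-linear classifier are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def glu_gate(x, w_value, w_gate):
    """Gated linear unit: (x @ W_value) * sigmoid(x @ W_gate).

    The sigmoid gate lies in (0, 1) and scales each feature, which is one
    plausible way to keep salient features and damp the rest (assumption).
    """
    return (x @ w_value) * sigmoid(x @ w_gate)


def cross_attention(queries, keys_values):
    """Single-head scaled dot-product attention across two modalities.

    Queries come from one modality; keys and values come from the other.
    Returns the attended features and the attention weights.
    """
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ keys_values, weights


rng = np.random.default_rng(0)
d_model = 16                              # hypothetical feature size
text = rng.normal(size=(12, d_model))     # 12 hypothetical title tokens
frames = rng.normal(size=(8, d_model))    # 8 hypothetical video frames

# Randomly initialised projections stand in for learned parameters.
w_text_v = 0.1 * rng.normal(size=(d_model, d_model))
w_text_g = 0.1 * rng.normal(size=(d_model, d_model))
w_frame_v = 0.1 * rng.normal(size=(d_model, d_model))
w_frame_g = 0.1 * rng.normal(size=(d_model, d_model))

# Stage 1: gate each modality to emphasise salient features.
text_sel = glu_gate(text, w_text_v, w_text_g)
frame_sel = glu_gate(frames, w_frame_v, w_frame_g)

# Stage 2: each modality attends to the other.
text_enh, attn_text_to_frame = cross_attention(text_sel, frame_sel)
frame_enh, attn_frame_to_text = cross_attention(frame_sel, text_sel)

# Aggregate: mean-pool each stream, concatenate, apply a linear classifier.
fused = np.concatenate([text_enh.mean(axis=0), frame_enh.mean(axis=0)])
w_cls = 0.1 * rng.normal(size=(2 * d_model,))
p_fake = sigmoid(fused @ w_cls)           # probability-like score in [0, 1]
```

In this sketch the gating step decides per feature how much signal passes through, while the cross-attention step lets title tokens weight frames (and vice versa) before the two streams are pooled into one vector for classification. A trained model would learn the projection matrices rather than sample them.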
