Exploring Modality Disruption in Multimodal Fake News Detection

Moyang Liu, Kaiying Yan, Yukun Liu, Ruibo Fu, Zhengqi Wen, Xuefei Liu, Chenxing Li

TL;DR

The paper tackles modality disruption in multimodal fake news detection on social media by introducing FND-MoE, a Mixture-of-Experts framework with modality-specific encoders, cross-attention fusion, and a two-pass feature selection pipeline. The two-pass mechanism combines an attention-based top-$k$ gate with a differentiable Gumbel-Sigmoid selector to filter out disruptive information, followed by a transformer classifier. Empirical results on FakeSV and FVC-2018 show that FND-MoE outperforms state-of-the-art methods, with accuracy gains of $3.45\%$ and $3.71\%$, respectively, and ablations justify the chosen gating strategy. The approach highlights the importance of mitigating modality disruption for robust multimodal fake news detection, with practical implications for real-world social-media data analysis.

Abstract

The rapid growth of social media has led to the widespread dissemination of fake news across multiple content forms, including text, images, audio, and video. Compared to unimodal fake news detection, multimodal fake news detection benefits from the increased availability of information across multiple modalities. However, in the context of social media, certain modalities in multimodal fake news detection tasks may contain disruptive or over-expressive information. These elements often include exaggerated or embellished content. We define this phenomenon as modality disruption and explore its impact on detection models through experiments. To address the issue of modality disruption in a targeted manner, we propose a multimodal fake news detection framework, FND-MoE. Additionally, we design a two-pass feature selection mechanism to further mitigate the impact of modality disruption. Extensive experiments on the FakeSV and FVC-2018 datasets demonstrate that FND-MoE significantly outperforms state-of-the-art methods, with accuracy improvements of 3.45% and 3.71% on the respective datasets compared to baseline models.

Paper Structure

This paper contains 13 sections, 6 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Potential scenarios that may arise on social media. (a) illustrates that fake news often aims to mislead by manipulating and distorting factual information within a specific modality to create a deceptive narrative. (b) demonstrates that some modalities contain disruptive or overly expressive embellishments designed to attract attention, which does not alter the inherent truthfulness of the news.
  • Figure 2: Architecture of the multimodal fake news detection framework FND-MoE.
  • Figure 3: Architecture of the two-pass feature selection mechanism. The input set of $\binom{N}{2}$ feature vectors is processed through an attention gate, from which the top-$k$ vectors are selected based on their scores. These $k$ feature vectors are then passed through another attention gate, where the Gumbel-Sigmoid function is applied to generate the final output.
  • Figure 4: The weight values output by the model when using two mechanisms.
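
The two-pass selection described in the Figure 3 caption can be illustrated with a small, standard-library-only sketch. It is not the authors' implementation. The following are all assumptions made for illustration: the four modality names, the feature dimension, the random placeholder features, the use of a single query vector per gate with scaled dot-product scoring, and the choice to multiply each surviving vector by its Gumbel-Sigmoid weight. The paper's actual gates are learned and may be parameterised differently.

```python
import math
import random
from itertools import combinations


def gate_logits(features, query):
    """Scaled dot-product score of each feature vector against a gate query."""
    scale = math.sqrt(len(query))
    return [sum(f * q for f, q in zip(vec, query)) / scale for vec in features]


def top_k_gate(features, query, k):
    """First pass: keep the k feature vectors with the highest gate scores."""
    scores = gate_logits(features, query)
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    kept = ranked[:k]
    return kept, [features[i] for i in kept]


def sample_gumbel(rng, eps=1e-9):
    """Draw one Gumbel(0, 1) sample, clamping the uniform draw away from 0 and 1."""
    u = min(max(rng.random(), eps), 1.0 - eps)
    return -math.log(-math.log(u))


def gumbel_sigmoid(logit, rng, tau=1.0):
    """Relaxed binary gate: sigmoid((logit + g1 - g2) / tau) with Gumbel noise."""
    x = (logit + sample_gumbel(rng) - sample_gumbel(rng)) / tau
    return 1.0 / (1.0 + math.exp(-x))


def two_pass_select(features, query_1, query_2, k, rng, tau=1.0):
    """Top-k gate followed by a Gumbel-Sigmoid gate that re-weights the survivors."""
    kept, selected = top_k_gate(features, query_1, k)
    logits = gate_logits(selected, query_2)
    weights = [gumbel_sigmoid(l, rng, tau) for l in logits]
    # Assumption: the soft gate scales each surviving vector by its weight.
    outputs = [[w * f for f in vec] for w, vec in zip(weights, selected)]
    return kept, weights, outputs


# Hypothetical setup: N = 4 modalities give C(4, 2) = 6 pairwise fused features.
rng = random.Random(0)
modalities = ["text", "audio", "image", "video"]
pairs = list(combinations(modalities, 2))
dim = 8
# Random placeholders standing in for cross-attention outputs and gate queries.
features = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in pairs]
query_1 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
query_2 = [rng.gauss(0.0, 1.0) for _ in range(dim)]

kept, weights, outputs = two_pass_select(features, query_1, query_2, k=3, rng=rng)
for i, w in zip(kept, weights):
    print(pairs[i], round(w, 3))
```

The hard top-$k$ step discards low-scoring pairs outright, while the Gumbel-Sigmoid step gives each survivor a weight in $(0, 1)$ that remains differentiable with respect to its logit. Lowering the temperature `tau` pushes those weights toward 0 or 1.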