MViR: Multi-View Visual-Semantic Representation for Fake News Detection

Haochen Liang, Xinqi Su, Jun Wang, Chaomeng Chen, Zitong Yu

TL;DR

This work proposes a Multi-View Visual-Semantic Representation (MViR) framework, which includes a Multi-View Representation module using pyramid dilated convolution to capture multi-view visual-semantic features, a Multi-View Feature Fusion module to integrate these features with text, and multiple aggregators to extract multi-view semantic cues for detection.

Abstract

With the rise of online social networks, detecting fake news accurately is essential for a healthy online environment. While existing methods have advanced multimodal fake news detection, they often neglect the multi-view visual-semantic aspects of news, such as different text perspectives of the same image. To address this, we propose a Multi-View Visual-Semantic Representation (MViR) framework. Our approach includes a Multi-View Representation module using pyramid dilated convolution to capture multi-view visual-semantic features, a Multi-View Feature Fusion module to integrate these features with text, and multiple aggregators to extract multi-view semantic cues for detection. Experiments on benchmark datasets demonstrate the superiority of MViR. The source code of MViR is available at https://github.com/FlowerinZDF/FakeNews-MVIR.

Paper Structure (13 sections, 12 equations, 3 figures, 4 tables)

Figures (3)

  • Figure 1: Motivation of our proposed MViR. We can see that different news texts describe the same image from various perspectives. For instance, some focus on the background building, others on the sign, and some on the person.
  • Figure 2: The MViR framework consists of three modules: Multi-View Representation (MVR), Multi-View Feature Fusion (MVFF), and Multi-View Aggregation (MVA). It extracts image and text features, learns multi-view visual-semantic representations via MVR, fuses features with MVFF, and uses MVA to generate embeddings and predict fake news probabilities.
  • Figure 3: Analysis for different numbers of views.
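
The "pyramid dilated convolution" idea behind the MVR module can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names, the all-ones 3x3 kernel, and the dilation rates (1, 2, 3) are assumptions chosen for illustration, and a real model would apply learned kernels to image feature maps. The sketch only shows how the same kernel, applied at several dilation rates, yields several "views" of one input with progressively wider receptive fields.

```python
def dilated_conv2d(image, kernel, dilation):
    """Same-size 2D cross-correlation with zero padding and a given dilation.

    Assumes a square kernel of odd size (hypothetical helper, not from the paper).
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    yy = y + (i - r) * dilation
                    xx = x + (j - r) * dilation
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def pyramid_views(image, kernel, dilations=(1, 2, 3)):
    """One output map ("view") per dilation rate; the rates are an assumption."""
    return [dilated_conv2d(image, kernel, d) for d in dilations]


def receptive_field(kernel_size, dilation):
    """Side length of the input window covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


# Usage: a 7x7 input with a single active pixel at the centre.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
kernel = [[1.0] * 3 for _ in range(3)]

views = pyramid_views(image, kernel)
for d, view in zip((1, 2, 3), views):
    print(f"dilation={d}, receptive field={receptive_field(3, d)}")
    for row in view:
        print(" ".join(str(int(v)) for v in row))
```

With dilation 1 the response stays in a tight 3x3 block around the centre, while dilations 2 and 3 spread the same nine responses over 5x5 and 7x7 windows. This is the sense in which each dilation rate looks at the image from a different spatial scale.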