MViR: Multi-View Visual-Semantic Representation for Fake News Detection

Haochen Liang; Xinqi Su; Jun Wang; Chaomeng Chen; Zitong Yu

TL;DR

This work proposes a Multi-View Visual-Semantic Representation (MViR) framework, which includes a Multi-View Representation module using pyramid dilated convolution to capture multi-view visual-semantic features, a Multi-View Feature Fusion module to integrate these features with text, and multiple aggregators to extract multi-view semantic cues for detection.

Abstract

With the rise of online social networks, detecting fake news accurately is essential for a healthy online environment. While existing methods have advanced multimodal fake news detection, they often neglect the multi-view visual-semantic aspects of news, such as different text perspectives of the same image. To address this, we propose a Multi-View Visual-Semantic Representation (MViR) framework. Our approach includes a Multi-View Representation module using pyramid dilated convolution to capture multi-view visual-semantic features, a Multi-View Feature Fusion module to integrate these features with text, and multiple aggregators to extract multi-view semantic cues for detection. Experiments on benchmark datasets demonstrate the superiority of MViR. The source code of MViR is available at https://github.com/FlowerinZDF/FakeNews-MVIR.
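
To illustrate the idea behind pyramid dilated convolution, the following is a minimal sketch, not the authors' implementation. It assumes a 1D sequence of feature values instead of a 2D image feature map, a fixed kernel shared across branches, and dilation rates (1, 2, 4); the function names `dilated_conv1d` and `pyramid_views` are hypothetical. Each dilation rate spaces the kernel taps further apart, so each branch summarizes the same input over a different receptive field, which is one way to read "multiple views" of the same visual content.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Zero-padded, same-length 1D dilated convolution (cross-correlation form)."""
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    if dilation < 1:
        raise ValueError("dilation must be a positive integer")
    half = len(kernel) // 2
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` positions apart around the center i.
            idx = i + (j - half) * dilation
            if 0 <= idx < n:  # positions outside the input act as zeros
                acc += w * x[idx]
        out.append(acc)
    return out


def pyramid_views(
    x: Sequence[float],
    kernel: Sequence[float],
    dilations: Sequence[int] = (1, 2, 4),  # assumed rates, not taken from the paper
) -> List[List[float]]:
    """Return one output sequence ("view") per dilation rate."""
    return [dilated_conv1d(x, kernel, d) for d in dilations]


# Toy input: eight feature values and a 3-tap summing kernel.
features = [1, 2, 3, 4, 5, 6, 7, 8]
views = pyramid_views(features, kernel=[1, 1, 1])
for rate, view in zip((1, 2, 4), views):
    print(f"dilation={rate}: {view}")
```

In a real model the branches would be learned 2D convolutions over image features, and their outputs would be fused with text features downstream; this sketch only shows how changing the dilation rate changes which positions each output aggregates.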

Paper Structure (13 sections, 12 equations, 3 figures, 4 tables)


Introduction
Methodology
Feature Extraction
Multi-View Representation
Multi-View Feature Fusion
Multi-View Aggregation
Objective Function
Experiments
Datasets and Experimental Settings
Performance Comparison
Ablation Study
Parameter Sensitivity Analysis
Conclusion

Figures (3)

Figure 1: Motivation of our proposed MViR. We can see that different news texts describe the same image from various perspectives. For instance, some focus on the background building, others on the sign, and some on the person.
Figure 2: The MViR framework consists of three modules: Multi-View Representation (MVR), Multi-View Feature Fusion (MVFF), and Multi-View Aggregation (MVA). It extracts image and text features, learns multi-view visual-semantic representations via MVR, fuses features with MVFF, and uses MVA to generate embeddings and predict fake news probabilities.
Figure 3: Analysis for different numbers of views.
