Towards Robust Multimodal Sentiment Analysis with Incomplete Data

Haoyu Zhang; Wenbin Wang; Tianshu Yu

TL;DR

The proposed LNLN features a dominant modality correction (DMC) module and a dominant modality based multimodal learning (DMML) module, which together enhance the model's robustness across various noise scenarios by ensuring the quality of dominant modality representations.

Abstract

The field of Multimodal Sentiment Analysis (MSA) has recently witnessed an emerging direction that seeks to tackle the issue of data incompleteness. Recognizing that the language modality typically contains dense sentiment information, we treat it as the dominant modality and present a Language-dominated Noise-resistant Learning Network (LNLN) to achieve robust MSA. The proposed LNLN features a dominant modality correction (DMC) module and a dominant modality based multimodal learning (DMML) module, which together enhance the model's robustness across various noise scenarios by ensuring the quality of dominant modality representations. Aside from the methodological design, we perform comprehensive experiments under random data missing scenarios, using diverse and meaningful settings on several popular datasets (e.g., MOSI, MOSEI, and SIMS), which provides additional uniformity, transparency, and fairness compared to existing evaluations in the literature. Empirically, LNLN consistently outperforms existing baselines across these challenging and extensive evaluation settings.

Paper Structure (31 sections, 14 equations, 8 figures, 14 tables)

Introduction
Related Work
Multimodal Sentiment Analysis
Robust Representation Learning in MSA
Method
Overview
Input Construction and Multimodal Input
Dominant Modality based Multimodal Learning
Dominant Modality Correction
Reconstructor
Overall Learning Objectives
Experiments and Analysis
Datasets
Evaluation Settings and Criteria
Implementation Details
...and 16 more sections

Figures (8)

Figure 1: Overall pipeline. Note: $H^0_{l}$, $H^0_{v}$, $H^0_{a}$, $H_{cc}$, and $H^0_{p}$ are randomly initialized learnable vectors.
Figure 2: Performance curves under various missing rates. (a), (b), and (c) are the F1 curves on MOSI, MOSEI, and SIMS, respectively. (d), (e), and (f) are the MAE curves on MOSI, MOSEI, and SIMS, respectively. Note: A smaller MAE indicates better performance.
Figure 3: Visualization of successful and failed predictions. Note: The input is visualized to facilitate readers' understanding. In practice, random data missing is applied to the original input sequence, as described in the Method section.
Figure 4: Data distribution of MOSI, MOSEI, and SIMS datasets.
Figure 5: Seven-category confusion matrix of several representative methods on MOSI dataset. Note: 0-6 denote strongly negative, weakly negative, negative, neutral, weakly positive, positive, and strongly positive, respectively.
...and 3 more figures
