Semi-IIN: Semi-supervised Intra-inter modal Interaction Learning Network for Multimodal Sentiment Analysis

Jinhao Lin; Yifei Wang; Yanwu Xu; Qi Liu

TL;DR

This paper addresses the high annotation cost and label ambiguity in multimodal sentiment analysis by introducing Semi-IIN, a semi-supervised framework that dynamically balances intra- and inter-modal interactions via masked attention and gate-based fusion. It combines two dedicated attention streams, IntraMA and InterMA, with a self-training scheme that generates pseudo-labels from unlabeled data, and it is optimized with a combination of three losses: $L_v$ (MSE), $L_e$ (cross-entropy), and $L_e^u$. The approach reports state-of-the-art performance on MOSI and MOSEI, supported by ablations and qualitative analyses, and provides insight into why separate intra- and inter-modal pathways help sentiment understanding. The work makes multimodal sentiment analysis more practical by reducing labeling requirements and making better use of unlabeled data, while keeping the intra- and inter-modal interactions interpretable.
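The loss combination and pseudo-labeling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the confidence threshold, the loss weights `w_e` and `w_u`, and the reading of $L_e^u$ as a cross-entropy term on pseudo-labeled unlabeled samples are all assumptions made for illustration, and every function name here is hypothetical.

```python
import math


def mse(preds, targets):
    """Mean squared error, standing in for L_v (regression loss)."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def cross_entropy(probs, labels):
    """Mean negative log-likelihood, standing in for L_e.

    probs: list of per-class probability lists; labels: class indices.
    """
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)


def select_pseudo_labels(probs, threshold=0.9):
    """Keep unlabeled samples whose top class probability reaches the threshold.

    The threshold rule is an assumption; the summary does not state how
    pseudo-labels are filtered. Returns a list of (sample index, class).
    """
    selected = []
    for i, p in enumerate(probs):
        confidence = max(p)
        if confidence >= threshold:
            selected.append((i, p.index(confidence)))
    return selected


def total_loss(l_v, l_e, l_e_u, w_e=1.0, w_u=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return l_v + w_e * l_e + w_u * l_e_u


# Example: only confident predictions on unlabeled data become pseudo-labels.
unlabeled_probs = [[0.95, 0.05], [0.60, 0.40], [0.10, 0.90]]
pseudo = select_pseudo_labels(unlabeled_probs, threshold=0.9)
l_e_u = cross_entropy([unlabeled_probs[i] for i, _ in pseudo],
                      [y for _, y in pseudo])
```

In this sketch the ambiguous second sample is discarded, so only the first and third samples contribute to the unlabeled-data term.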

Abstract

Although multimodal sentiment analysis is a fertile research area that merits further investigation, current approaches incur high annotation costs and suffer from label ambiguity, which makes high-quality labeled data difficult to acquire. Furthermore, choosing the right interactions is essential because the relative importance of intra- and inter-modal interactions can differ from sample to sample. To this end, we propose Semi-IIN, a Semi-supervised Intra-inter modal Interaction learning Network for multimodal sentiment analysis. Semi-IIN integrates masked attention and gating mechanisms, enabling effective dynamic selection after independently capturing intra- and inter-modal interactive information. Combined with the self-training approach, Semi-IIN fully utilizes the knowledge learned from unlabeled data. Experimental results on two public datasets, MOSI and MOSEI, demonstrate the effectiveness of Semi-IIN, establishing a new state of the art on several metrics. Code is available at https://github.com/flow-ljh/Semi-IIN.
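The dynamic selection between intra- and inter-modal information can be pictured as a learned gate that mixes the two representations per sample. The sketch below shows one common form of gated fusion, a scalar sigmoid gate over the concatenated features; the abstract does not specify the gate used in Semi-IIN, so the formula, the function name `gated_fusion`, and its parameters are assumptions for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h_intra, h_inter, weights, bias=0.0):
    """Mix two equal-length feature vectors with a scalar gate.

    g = sigmoid(w . [h_intra; h_inter] + b)
    fused = g * h_intra + (1 - g) * h_inter

    This is a generic gating form, assumed here; it is not taken from the paper.
    """
    if len(h_intra) != len(h_inter):
        raise ValueError("h_intra and h_inter must have the same length")
    concat = list(h_intra) + list(h_inter)
    if len(weights) != len(concat):
        raise ValueError("weights must match the concatenated feature length")
    g = sigmoid(sum(w * x for w, x in zip(weights, concat)) + bias)
    fused = [g * a + (1.0 - g) * b for a, b in zip(h_intra, h_inter)]
    return g, fused


# With zero weights the gate is 0.5, so both streams contribute equally.
gate, fused = gated_fusion([1.0, 3.0], [3.0, 1.0], [0.0, 0.0, 0.0, 0.0])
```

A gate near 1 would favor the intra-modal stream for that sample, and a gate near 0 would favor the inter-modal stream, which is the kind of per-sample selection the abstract describes.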
