Multimodal Mixture of Low-Rank Experts for Sentiment Analysis and Emotion Recognition

Shuo Zhang, Jinsong Zhang, Zhejun Zhang, Lei Li

TL;DR

This work tackles the joint problem of multimodal sentiment analysis and emotion recognition by addressing parameter conflicts that arise from naive parameter sharing. It introduces Multimodal Mixture of Low-Rank Experts (MMoLRE), which uses shared and task-specific low-rank experts plus a UniTSE module and task-adaptive fusion to model commonalities and differences between tasks while keeping computational costs low. The approach achieves state-of-the-art results on MSA benchmarks (CMU-MOSI and CMU-MOSEI) and competitive performance on MER, supported by extensive ablations and parameter studies. Overall, MMoLRE demonstrates that explicit task separation with low-rank MoE can enhance multi-task learning for multimodal affective computing with substantial parameter savings and scalable capacity.

Abstract

Multi-task learning (MTL) enables the efficient transfer of extra knowledge acquired from other tasks. The high correlation between multimodal sentiment analysis (MSA) and multimodal emotion recognition (MER) supports their joint training. However, existing methods primarily employ hard parameter sharing, ignoring parameter conflicts caused by complex task correlations. In this paper, we present a novel MTL method for MSA and MER, termed Multimodal Mixture of Low-Rank Experts (MMoLRE). MMoLRE utilizes shared and task-specific experts to distinctly model common and unique task characteristics, thereby avoiding parameter conflicts. Additionally, inspired by low-rank structures in the Mixture of Experts (MoE) framework, we design low-rank expert networks to reduce parameter and computational overhead as the number of experts increases. Extensive experiments on the CMU-MOSI and CMU-MOSEI benchmarks demonstrate that MMoLRE achieves state-of-the-art performance on the MSA task and competitive results on the MER task.
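
To make the parameter-saving argument concrete, the sketch below compares the weight count of a dense linear expert with that of a low-rank expert factored into two smaller matrices. It is only an illustration under stated assumptions: the function names, the feature width of 768, the rank of 8, and the expert count of 8 are hypothetical and not taken from the paper, biases are ignored, and the actual expert architecture in MMoLRE may differ.

```python
def dense_expert_params(d_in: int, d_out: int) -> int:
    """Weights in one dense linear expert (a d_in x d_out matrix, no bias)."""
    return d_in * d_out


def low_rank_expert_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in one low-rank expert: a d_in x rank down-projection
    followed by a rank x d_out up-projection (no bias)."""
    return d_in * rank + rank * d_out


def total_expert_params(num_experts: int, per_expert: int) -> int:
    """Total weights across all experts; grows linearly with the expert count."""
    return num_experts * per_expert


# Hypothetical sizes chosen for illustration only (not from the paper).
D_MODEL, RANK, NUM_EXPERTS = 768, 8, 8

dense_total = total_expert_params(NUM_EXPERTS, dense_expert_params(D_MODEL, D_MODEL))
low_rank_total = total_expert_params(
    NUM_EXPERTS, low_rank_expert_params(D_MODEL, D_MODEL, RANK)
)

print(f"dense experts:    {dense_total} weights")
print(f"low-rank experts: {low_rank_total} weights")
print(f"reduction factor: {dense_total // low_rank_total}x")
```

With these assumed sizes, eight dense experts hold 4,718,592 weights while eight low-rank experts hold 98,304, a 48-fold reduction. Both totals grow linearly with the number of experts, but the low-rank slope is far smaller, which is the sense in which adding experts stays cheap.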

Paper Structure

This paper contains 19 sections, 5 equations, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: The overview of MMoLRE.
  • Figure 2: Analysis of the number of shared low-rank experts and the top-$k$ selection of experts by the task-specific router networks on CMU-MOSEI. Each metric is mapped to the same numerical scale, with darker colors indicating better performance.
  • Figure 3: Ablation study on CMU-MOSEI.