Table of Contents
Fetching ...

IDRL: An Individual-Aware Multimodal Depression-Related Representation Learning Framework for Depression Diagnosis

Chongxiao Wang, Junjie Liang, Peng Cao, Jinzhu Yang, Osmar R. Zaiane

Abstract

Depression is a severe mental disorder, and reliable identification plays a critical role in early intervention and treatment. Multimodal depression detection aims to improve diagnostic performance by jointly modeling complementary information from multiple modalities. Recently, numerous multimodal learning approaches have been proposed for depression analysis; however, these methods suffer from the following limitations: 1) inter-modal inconsistency and depression-unrelated interference, where depression-related cues may conflict across modalities while substantial irrelevant content obscures critical depressive signals, and 2) diverse individual depressive presentations, leading to individual differences in modality and cue importance that hinder reliable fusion. To address these issues, we propose Individual-aware Multimodal Depression-related Representation Learning Framework (IDRL) for robust depression diagnosis. Specifically, IDRL 1) disentangles multimodal representations into a modality-common depression space, a modality-specific depression space, and a depression-unrelated space to enhance modality alignment while suppressing irrelevant information, and 2) introduces an individual-aware modality-fusion module (IAF) that dynamically adjusts the weights of disentangled depression-related features based on their predictive significance, thereby achieving adaptive cross-modal fusion for different individuals. Extensive experiments demonstrate that IDRL achieves superior and robust performance for multimodal depression detection.

IDRL: An Individual-Aware Multimodal Depression-Related Representation Learning Framework for Depression Diagnosis

Abstract

Depression is a severe mental disorder, and reliable identification plays a critical role in early intervention and treatment. Multimodal depression detection aims to improve diagnostic performance by jointly modeling complementary information from multiple modalities. Recently, numerous multimodal learning approaches have been proposed for depression analysis; however, these methods suffer from the following limitations: 1) inter-modal inconsistency and depression-unrelated interference, where depression-related cues may conflict across modalities while substantial irrelevant content obscures critical depressive signals, and 2) diverse individual depressive presentations, leading to individual differences in modality and cue importance that hinder reliable fusion. To address these issues, we propose Individual-aware Multimodal Depression-related Representation Learning Framework (IDRL) for robust depression diagnosis. Specifically, IDRL 1) disentangles multimodal representations into a modality-common depression space, a modality-specific depression space, and a depression-unrelated space to enhance modality alignment while suppressing irrelevant information, and 2) introduces an individual-aware modality-fusion module (IAF) that dynamically adjusts the weights of disentangled depression-related features based on their predictive significance, thereby achieving adaptive cross-modal fusion for different individuals. Extensive experiments demonstrate that IDRL achieves superior and robust performance for multimodal depression detection.
Paper Structure (24 sections, 5 equations, 3 figures, 4 tables)

This paper contains 24 sections, 5 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: The video-audio mutual information score of each time segment is calculated on the AVEC-2014 depression dataset. Higher mutual information scores indicate stronger consistency across modalities within the corresponding segments.
  • Figure 2: (a)The proposed Individual-aware Multimodal Depression-related Representation Learning Framework (IDRL) consists of three key stages: multimodal feature extraction, depression-representation disentanglement, and feature fusion and prediction.(b)Depression Representation Disentanglement Module (DRD).(c)Individual-Aware Modality-Fusion Module (IAF).
  • Figure 3: Visualization of (a) disentangled depression-related and unrelated features and (b) the effect of task-unrelated loss on depression prediction.