3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection

Rina Carines Cabral; Siwen Luo; Josiah Poon; Soyeon Caren Han

TL;DR

This work introduces a Multimodal and Multi-Teacher Knowledge Distillation model for Mental Health Classification, leveraging insights from cross-modal human understanding, and addresses the challenge of appropriately representing inputs of varying natures.

Abstract

The significance of mental health classification is paramount in contemporary society, where digital platforms serve as crucial sources for monitoring individuals' well-being. However, existing social media mental health datasets primarily consist of text-only samples, potentially limiting the efficacy of models trained on such data. Recognising that humans utilise cross-modal information to comprehend complex situations or issues, we present a novel approach to address the limitations of current methodologies. In this work, we introduce a Multimodal and Multi-Teacher Knowledge Distillation model for Mental Health Classification, leveraging insights from cross-modal human understanding. Unlike conventional approaches that often rely on simple concatenation to integrate diverse features, our model addresses the challenge of appropriately representing inputs of varying natures (e.g., texts and sounds). To mitigate the computational complexity associated with integrating all features into a single model, we employ a multimodal and multi-teacher architecture. By distributing the learning process across multiple teachers, each specialising in a particular feature extraction aspect, we enhance the overall mental health classification performance. Through experimental validation, we demonstrate the efficacy of our model in achieving improved performance.
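The multi-teacher idea in the abstract can be illustrated with a generic distillation objective. The following is a minimal sketch, not the paper's implementation: it assumes a standard temperature-scaled knowledge distillation loss in which the student matches the softened predictions of several teachers, averaged with equal weight. The function names, the equal teacher weighting, and the `temperature` and `alpha` parameters are illustrative assumptions; the teacher labels (text, emotion, audio) follow the section titles listed in the paper structure.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stabilised)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def multi_teacher_kd_loss(student_logits, teacher_logits_list, label,
                          temperature=2.0, alpha=0.5):
    """Hypothetical combined loss for one sample.

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    average KL divergence from each teacher's softened distribution to the
    student's. Equal teacher weighting is an assumption of this sketch.
    """
    # Hard-label cross-entropy on the student's unsoftened prediction.
    ce = -math.log(softmax(student_logits)[label])

    # Soft-label term: average KL(teacher || student) at the given temperature.
    student_soft = softmax(student_logits, temperature)
    kd = sum(
        kl_divergence(softmax(t, temperature), student_soft)
        for t in teacher_logits_list
    ) / len(teacher_logits_list)

    # The temperature**2 factor is the usual rescaling of the soft-label term.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kd


# Illustrative logits for a three-class problem (made-up numbers).
student = [2.0, 0.0, -1.0]
teachers = {
    "text": [2.5, 0.1, -1.2],
    "emotion": [1.0, 0.8, -0.5],
    "audio": [0.5, 0.4, 0.1],
}
loss = multi_teacher_kd_loss(student, list(teachers.values()), label=0)
```

In this sketch each teacher contributes a separate soft target rather than having its features concatenated into a single model, which is the general pattern the abstract contrasts with simple concatenation. How the paper actually weights or combines its teachers is not stated in this section.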

Paper Structure (24 sections, 6 figures, 10 tables)

Introduction
Related Works
Mental Health Classification
Multi-teacher Knowledge Distillation
3M-Health
Multimodal Multi-Teacher Construction
Text-based Teacher
Emotion-based Teacher
Audio-based Teacher
Multimodal Multi-Teacher Fine-tuning
Multi-Teacher Knowledge Distillation
Experimental Setup
Datasets
Text-to-Audio Generators
Baselines and Metrics
...and 9 more sections

Figures (6)

Figure 1: Architecture of 3M-Health: Multimodal Multi-teacher Knowledge Distillation for Mental Health Detection.
Figure 2: Class distribution. For (a) TwitSuicide, SI: Safe to Ignore; PC: Possibly Concerning; SC: Strongly Concerning. For (b) DEPTWEET, ND: Non-depression; MI: Mild; MO: Moderate; SE: Severe. For (c) IdenDep, NDE: Non-depression; DE: Depression. For (d) SDCNL, DEP: Depression; SUI: Suicide.
Figure 3: Audio length comparison. ch: character average
Figure 4: Audio analysis using PCA on spectrogram images of audio samples grouped by a maximum of 10s (left) and 10-25s (right). Each sample is labelled with an ID for reference to corresponding texts provided in the Supplementary Material.
Figure 5: Distribution of multi-label emotion class labels.
...and 1 more figure
