WEE-Therapy: A Mixture of Weak Encoders Framework for Psychological Counseling Dialogue Analysis

Yongqi Kang, Yong Zhao

TL;DR

WEE-Therapy tackles the domain adaptation gap in AudioLLMs for psychological counseling by introducing a Mixture of Weak Encoders (MoWE) framework that augments a strong base encoder with a pool of lightweight domain-specialist encoders. A dual-routing mechanism combines data-independent domain priors with dynamic, input-dependent selection among the weak encoders, producing a rich representation $z_i$ that is fed into an LLM via an adapter and projection for text generation. Training uses a multi-task objective that couples next-token prediction with a MoWE routing loss, which encourages confident and diverse encoder usage while most base-encoder and LLM parameters stay frozen. The approach improves results on emotion recognition, counseling technique classification, crisis risk detection, and dialogue summarization, illustrating the practicality of parameter-efficient, domain-adapted AudioLLMs for AI-assisted clinical analysis. This work offers a scalable blueprint for adapting general large models to specialized vertical domains and could extend to other low-resource settings in computational psychology and beyond.
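
The summary does not give the form of the routing loss, so the following is only a minimal sketch of one common way to reward "confident and diverse" encoder usage. It assumes the router outputs a probability vector over weak encoders for each sample. The names `routing_loss`, `total_loss`, and the weight `lam` are hypothetical, and the entropy-based formulation is an assumption rather than the authors' stated objective.

```python
import numpy as np

def routing_loss(gate_probs, eps=1e-9):
    """Hypothetical routing regularizer (an assumption, not the paper's formula).

    gate_probs: array of shape (batch, num_encoders); each row sums to 1.
    """
    p = np.asarray(gate_probs, dtype=float)
    # Confidence term: mean per-sample entropy. It is low when each
    # sample puts most of its weight on a few encoders.
    per_sample_entropy = -np.sum(p * np.log(p + eps), axis=1).mean()
    # Diversity term: entropy of the batch-averaged usage. It is high
    # when the encoders are used evenly across the batch.
    usage = p.mean(axis=0)
    usage_entropy = -np.sum(usage * np.log(usage + eps))
    # Minimizing this favors confident per-sample routing and balanced usage.
    return per_sample_entropy - usage_entropy

def total_loss(ntp_loss, gate_probs, lam=0.1):
    """Next-token prediction loss plus the weighted routing term (lam is assumed)."""
    return ntp_loss + lam * routing_loss(gate_probs)

# Confident and diverse routing scores lower than collapsed routing.
diverse = np.array([[1.0, 0.0], [0.0, 1.0]])
collapsed = np.array([[1.0, 0.0], [1.0, 0.0]])
print(routing_loss(diverse) < routing_loss(collapsed))
```

Under this formulation, a router that always picks the same encoder and a router that spreads weight uniformly both score worse than one that picks different encoders confidently for different inputs.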

Abstract

The advancement of computational psychology requires AI tools capable of deeply understanding counseling dialogues. Existing audio language models (AudioLLMs) often rely on single speech encoders pre-trained on general data, struggling to capture domain-specific features like complex emotions and professional techniques. To address this, we propose WEE-Therapy, a multi-task AudioLLM incorporating a Weak Encoder Ensemble (WEE) mechanism. This supplements a powerful base encoder with a pool of lightweight, specialized encoders. A novel dual-routing strategy combines stable, data-independent domain knowledge with dynamic, data-dependent expert selection. Evaluated on emotion recognition, technique classification, risk detection, and summarization, WEE-Therapy achieves significant performance gains across all tasks with minimal parameter overhead, demonstrating strong potential for AI-assisted clinical analysis.
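
To make the dual-routing idea concrete, here is a rough sketch under stated assumptions. The data-independent route is modeled as a learned score vector shared by all inputs, the data-dependent route as a top-k softmax over scores computed from the base feature, and the two weight vectors are simply averaged before the weighted weak-encoder features are added to the base feature. The function `dual_route`, its parameters, and the averaging and additive fusion are all illustrative choices; the abstract does not specify how the two routes are combined.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def dual_route(base_feat, weak_feats, prior_logits, w_router, top_k=2):
    """Hypothetical dual routing over n weak encoders.

    base_feat:    (d,)   feature from the strong base encoder
    weak_feats:   (n, d) one feature vector per weak encoder
    prior_logits: (n,)   data-independent scores, identical for every input
    w_router:     (d, n) data-dependent router weights
    """
    # Data-independent route: fixed weights that encode domain priors.
    prior_w = softmax(np.asarray(prior_logits, dtype=float))
    # Data-dependent route: score encoders from the input, keep the top-k.
    scores = np.asarray(base_feat, dtype=float) @ w_router
    top = np.argsort(scores)[-top_k:]
    masked = np.full_like(scores, -np.inf)
    masked[top] = scores[top]
    dyn_w = softmax(masked)
    # Assumed fusion: average both routes, then add the weighted
    # weak-encoder mixture to the base feature.
    weights = 0.5 * (prior_w + dyn_w)
    z = base_feat + weights @ weak_feats
    return z, weights

rng = np.random.default_rng(0)
d, n = 8, 4
base = rng.normal(size=d)
weak = rng.normal(size=(n, d))
z, w = dual_route(base, weak, np.zeros(n), rng.normal(size=(d, n)))
print(z.shape, w.round(3))
```

In this sketch the prior route keeps every weak encoder slightly active, while the dynamic route concentrates extra weight on the encoders most relevant to the current input.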

Paper Structure

This paper contains 9 sections, 1 equation, 1 figure, and 2 tables.

Figures (1)

  • Figure 1: Overall architecture of the proposed WEE-Therapy framework.