In-context learning capabilities of Large Language Models to detect suicide risk among adolescents from speech transcripts

Filomene Roquefort; Alexandre Ducorroy; Rachid Riad

TL;DR

The paper tackles scalable detection of suicide risk in adolescents under privacy constraints by leveraging transcript-based analysis with in-context learning from Large Language Models, guided by the DSPy prompting framework. By evaluating multiple LLMs with zero-shot, few-shot (notably 4-shot with Gemma2-9b), and Chain-of-Thought prompts on the SW1 Chinese adolescent dataset, the authors achieve up to 0.68 accuracy on the test set and demonstrate robust effects of example count on performance. Statistical analysis reveals model-type and model-size interactions, with larger models offering higher baseline accuracy but diminishing gains from additional in-context examples ($R^2=0.134$). The work demonstrates a privacy-preserving, scalable pathway for automated suicide risk assessment from speech transcripts, while outlining future efforts to improve interpretability and clinical integration.

Abstract

Early suicide risk detection in adolescents is critical yet hindered by the limited scalability of current assessments. This paper presents our approach to the first SpeechWellness Challenge (SW1), which aims to assess suicide risk in Chinese adolescents through speech analysis. Due to speech anonymization constraints, we focused on linguistic features, leveraging Large Language Models (LLMs) for transcript-based classification. Using DSPy for systematic prompt engineering, we developed a robust in-context learning approach that outperformed traditional fine-tuning on both linguistic and acoustic markers. Our systems achieved third and fourth places among 180+ submissions, with 0.68 accuracy (F1=0.7) using only transcripts. Ablation analyses showed that increasing the number of prompt examples improved performance (p=0.003), with varying effects across model types and sizes. These findings advance automated suicide risk assessment and demonstrate LLMs' value in mental health applications.
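To make the notion of "number of prompt examples" concrete, the following is a minimal sketch of how a k-shot classification prompt could be assembled from labeled transcripts. It is written in plain Python and does not reproduce the authors' DSPy pipeline. The instruction wording, the label names, the `Example` class, and the `build_k_shot_prompt` helper are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """A labeled demonstration (hypothetical structure, not from the paper)."""
    transcript: str
    label: str  # assumed label names: "at-risk" / "not-at-risk"


# Assumed instruction text; the actual prompts used in the paper are not given here.
INSTRUCTION = (
    "Decide whether the speaker is at risk of suicide. "
    "Answer with 'at-risk' or 'not-at-risk'."
)


def build_k_shot_prompt(demos: List[Example], query: str, k: int) -> str:
    """Build a prompt with the first k demonstrations followed by the query.

    k = 0 gives a zero-shot prompt; k = 4 gives a 4-shot prompt, and so on.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    parts = [INSTRUCTION]
    for demo in demos[:k]:
        parts.append(f"Transcript: {demo.transcript}\nLabel: {demo.label}")
    # The query is left unlabeled so that the model completes the label.
    parts.append(f"Transcript: {query}\nLabel:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Placeholder strings stand in for real transcripts.
    demos = [
        Example("<transcript 1>", "at-risk"),
        Example("<transcript 2>", "not-at-risk"),
    ]
    print(build_k_shot_prompt(demos, "<new transcript>", k=2))
```

Varying `k` while holding the instruction and query fixed is one simple way to set up the kind of ablation over example count that the abstract describes.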
