Acoustic and Machine Learning Methods for Speech-Based Suicide Risk Assessment: A Systematic Review

Ambre Marie; Marine Garnier; Thomas Bertin; Laura Machart; Guillaume Dardenne; Gwenolé Quellec; Sofian Berrouiguet

TL;DR

This systematic review examines how acoustic speech features analyzed with AI and ML can aid suicide risk assessment. It synthesizes 33 studies identified through February 2025 and finds consistent differences between at-risk and not-at-risk individuals in features such as jitter, fundamental frequency (F0), Mel-frequency cepstral coefficients (MFCC), and power spectral density (PSD), with multimodal data fusion offering the strongest predictive gains. Despite high reported performance on some datasets (AUC up to 0.985 and accuracy up to 99.85%), substantial heterogeneity, class imbalance, and population bias limit generalizability, underscoring the need for standardized data collection and longitudinal studies. The review also emphasizes ethical considerations, data security, and the value of multimodal integration (acoustic, linguistic, visual, and metadata) for improving specificity and robustness in real-world settings. Overall, acoustic-based suicide risk assessment shows potential as a noninvasive tool to augment clinical decision-making, provided that rigorous validation and ethical safeguards are in place.

Abstract

Suicide remains a public health challenge, necessitating improved detection methods to facilitate timely intervention and treatment. This systematic review evaluates the role of Artificial Intelligence (AI) and Machine Learning (ML) in assessing suicide risk through acoustic analysis of speech. Following PRISMA guidelines, we analyzed 33 articles selected from PubMed, Cochrane, Scopus, and Web of Science databases. The last search was conducted in February 2025. Risk of bias was assessed using the PROBAST tool. Studies analyzing acoustic features between individuals at risk of suicide (RS) and those not at risk (NRS) were included, while studies lacking acoustic data, a suicide-related focus, or sufficient methodological details were excluded. Sample sizes varied widely and were reported in terms of participants or speech segments, depending on the study. Results were synthesized narratively based on acoustic features and classifier performance. Findings consistently showed significant acoustic feature variations between RS and NRS populations, particularly involving jitter, fundamental frequency (F0), Mel-frequency cepstral coefficients (MFCC), and power spectral density (PSD). Classifier performance varied based on algorithms, modalities, and speech elicitation methods, with multimodal approaches integrating acoustic, linguistic, and metadata features demonstrating superior performance. Among the 29 classifier-based studies, reported AUC values ranged from 0.62 to 0.985 and accuracies from 60% to 99.85%. Most datasets were imbalanced in favor of NRS, and performance metrics were rarely reported separately by group, limiting clear identification of direction of effect.
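The closing point about class imbalance and the lack of per-group metrics can be illustrated with a minimal sketch. It is not drawn from any reviewed study: the function name `per_group_metrics`, the label coding (1 = RS, 0 = NRS), and the 5-versus-95 split are assumptions chosen purely for illustration. It shows how a classifier that labels every sample NRS can still reach high overall accuracy while detecting no RS cases.

```python
def per_group_metrics(y_true, y_pred):
    """Return accuracy plus per-group rates for binary labels.

    Assumed coding (hypothetical, for illustration): 1 = RS, 0 = NRS.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # RS correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # RS missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # NRS correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # NRS wrongly flagged

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: share of RS samples detected; undefined if there are none.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of NRS samples correctly cleared.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    balanced_accuracy = (sensitivity + specificity) / 2

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


# Synthetic, made-up labels: 5 RS and 95 NRS samples.
y_true = [1] * 5 + [0] * 95
# A degenerate classifier that predicts NRS for every sample.
y_pred = [0] * 100

metrics = per_group_metrics(y_true, y_pred)
print(metrics)
```

On this synthetic split the degenerate classifier scores 0.95 accuracy with a sensitivity of 0.0 and a balanced accuracy of 0.5, which is why a headline accuracy figure is hard to interpret unless it is accompanied by metrics reported separately for the RS and NRS groups.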
