Conceptualizing Suicidal Behavior: Utilizing Explanations of Predicted Outcomes to Analyze Longitudinal Social Media Data
Van Minh Nguyen, Nasheen Nur, William Stern, Thomas Mercer, Chiradeep Sen, Siddhartha Bhattacharyya, Victor Tumbiolo, Seng Jhing Goh
TL;DR
The paper tackles early detection of suicidal ideation from social media under data collection and deployment constraints. It proposes a cost-efficient framework that uses Layer Integrated Gradients to derive token-level attributions from LLMs, combined with TF-IDF scaling, enabling screening of long-context Reddit histories without running large models at inference time. Evaluations across multiple encoders on the UMD dataset identify MentalRoBERTa-base as the best-performing model, and show that attribution-based explanations can improve interpretability and guide preliminary screening, especially in longitudinal data when class labels are merged. The work highlights practical potential for mental health prevention, while urging caution about ethics and data privacy, and outlines future directions for multimodal and time-series analyses.
Abstract
The COVID-19 pandemic has escalated mental health crises worldwide, with social isolation and economic instability contributing to a rise in suicidal behavior. Suicide can result from social factors such as shame, abuse, and abandonment, as well as from mental health conditions such as depression, Post-Traumatic Stress Disorder (PTSD), Attention-Deficit/Hyperactivity Disorder (ADHD), anxiety disorders, and bipolar disorders. As these conditions develop, signs of suicidal ideation may manifest in social media interactions. Analyzing social media data using artificial intelligence (AI) techniques can help identify patterns of suicidal behavior, providing valuable insights for suicide prevention agencies, professionals, and broader community awareness initiatives. Machine learning algorithms for this purpose require large volumes of accurately labeled data. Previous research has not fully explored the potential of incorporating explanations in analyzing and labeling longitudinal social media data. In this study, we employed a model explanation method, Layer Integrated Gradients, on top of a fine-tuned state-of-the-art language model to assign each token from Reddit users' posts an attribution score for predicting suicidal ideation. By extracting and analyzing the attributions of tokens from the data, we propose a methodology for preliminary screening of social media posts for suicidal ideation without using large language models during inference.
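The screening idea described above can be illustrated with a minimal sketch. It assumes that token attribution scores have already been extracted offline (for example, with Layer Integrated Gradients on a fine-tuned encoder) and stored in a lookup table, so that no language model is needed at screening time. The function names `tfidf_weights` and `screen_score`, the smoothed IDF formula, the sum-then-average aggregation, and the example attribution values are all illustrative assumptions; the exact scaling used in the paper is not specified in this summary.

```python
import math
from collections import Counter


def tfidf_weights(posts):
    """Return one {token: tf-idf} dict per post.

    `posts` is a list of token lists (one list per post in a user's history).
    The smoothed IDF used here is an assumption, not the paper's formula.
    """
    n_posts = len(posts)
    # Document frequency: number of posts in which each token appears.
    doc_freq = Counter(tok for post in posts for tok in set(post))
    weights = []
    for post in posts:
        counts = Counter(post)
        total = len(post) or 1  # avoid division by zero for empty posts
        weights.append({
            tok: (count / total)
            * (math.log((1 + n_posts) / (1 + doc_freq[tok])) + 1.0)
            for tok, count in counts.items()
        })
    return weights


def screen_score(posts, attributions):
    """Average, over posts, of the sum of attribution * tf-idf per token.

    `attributions` is a hypothetical precomputed {token: score} table;
    tokens missing from the table contribute zero.
    """
    if not posts:
        return 0.0
    per_post = [
        sum(attributions.get(tok, 0.0) * w for tok, w in wts.items())
        for wts in tfidf_weights(posts)
    ]
    return sum(per_post) / len(per_post)
```

A user history would be tokenized with the same tokenizer used when the attributions were extracted, and the resulting score compared against a threshold chosen on held-out data. Such a score is only a preliminary screening signal and not a substitute for the model's prediction or for professional assessment.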
