ChatThero: An LLM-Supported Chatbot for Behavior Change and Therapeutic Support in Addiction Recovery
Junda Wang, Zonghai Yao, Lingxi Li, Junhui Qian, Zhichao Yang, Hong Yu
TL;DR
ChatThero introduces a memory-persistent, stressor-aware LLM chatbot for addiction recovery, addressing relapse risk and limited access to ongoing care. It models therapy as a multi-agent system with a Patient Agent, an Environment Agent, and a Therapy Agent; the Therapy Agent is trained through supervised fine-tuning and direct preference optimization to learn MI/CBT strategies and cross-session carryover. The approach uses anonymized Reddit-derived profiles and a stressor ledger to simulate 3–6 session trajectories, and is evaluated with both automatic and clinician ratings of motivation, confidence, empathy, and clinical relevance. Results show ChatThero outperforming baselines in single- and multi-session settings, especially for harder patient profiles, and remaining robust to stressors. The work offers a scalable framework for addiction recovery and discusses ethical considerations and directions for real-world validation.
Abstract
Substance use disorders (SUDs) affect millions of people, and relapses are common, so treatment typically spans multiple sessions. Limited access to care makes recovery support even harder. We present \textbf{ChatThero}, an innovative low-cost, multi-session, stressor-aware, and memory-persistent autonomous \emph{language agent} designed to facilitate long-term behavior change and therapeutic support in addiction recovery. Unlike existing work, which mostly fine-tunes large language models (LLMs) on patient-therapist conversation data, ChatThero is trained in a multi-agent simulated environment that mirrors real therapy. We create anonymized patient profiles from recovery communities (e.g., Reddit) and classify patients as \texttt{easy}, \texttt{medium}, or \texttt{difficult}, three levels representing their resistance to recovery. We build an external environment that introduces stressors (e.g., social determinants of health) to simulate real-world situations, and we dynamically inject clinically grounded therapeutic strategies (motivational interviewing and cognitive behavioral therapy). Our evaluation, conducted by both blinded clinicians and an LLM-as-Judge, shows that ChatThero is superior in empathy and clinical relevance. We show that stressor simulation improves the robustness of ChatThero, and that explicit stressors increase relapse-like setbacks, matching real-world patterns. We also evaluate ChatThero with behavioral change metrics. On a 1--5 scale, ChatThero raises \texttt{motivation} by $+1.71$ points (from $2.39$ to $4.10$) and \texttt{confidence} by $+1.67$ points (from $1.52$ to $3.19$), substantially outperforming GPT-5. On \texttt{difficult} patients, ChatThero reaches the success milestone in $26\%$ fewer turns than GPT-5.
