SimClinician: A Multimodal Simulation Testbed for Reliable Psychologist AI Collaboration in Mental Health Diagnosis
Filippo Cenacchi, Longbing Cao, Deborah Richards
TL;DR
The paper addresses a gap in evaluation: AI-based mental health diagnosis is judged by accuracy alone, ignoring how clinicians respond to AI recommendations. It introduces SimClinician, a multimodal, privacy-preserving simulation testbed with a dashboard, avatar-based visualization, a policy-driven decision layer, and a headless validator to study clinician–AI negotiation end-to-end. Using the expanded E-DAIC corpus, it shows that a lightweight confirmation friction boosts acceptance by about 23 percentage points while keeping escalations under 9% and latency sub-second, and it demonstrates improved calibration of depression probabilities, with caveats for PTSD due to data imbalance. These results provide a scalable, ethics-aware framework for designing and evaluating clinician-in-the-loop AI before live deployment, informing interface design, risk communication, and governance for mental health diagnostics.
Abstract
AI-based mental health diagnosis is often judged by benchmark accuracy, yet in practice its value depends on how psychologists respond: whether they accept, adjust, or reject AI suggestions. Mental health makes this especially challenging: decisions are continuous and shaped by cues in patients' tone, pauses, word choice, and nonverbal behaviors. Current research rarely examines how the design of AI diagnosis interfaces influences these choices, leaving little basis for reliable testing before live studies. We present SimClinician, an interactive simulation platform that transforms patient data into psychologist-AI collaborative diagnosis. Contributions include: (1) a dashboard integrating audio, text, and gaze-expression patterns; (2) an avatar module rendering de-identified dynamics for analysis; (3) a decision layer that maps AI outputs to multimodal evidence, letting psychologists review AI reasoning and enter a diagnosis. Tested on the E-DAIC corpus (276 clinical interviews, expanded to 480,000 simulations), SimClinician shows that a confirmation step raises acceptance by 23% while keeping escalations below 9% and maintaining smooth interaction flow.
