SimClinician: A Multimodal Simulation Testbed for Reliable Psychologist AI Collaboration in Mental Health Diagnosis

Filippo Cenacchi, Longbing Cao, Deborah Richards

TL;DR

The paper addresses a gap in how AI-based mental health diagnosis is evaluated: by accuracy alone, ignoring how clinicians respond to AI recommendations. It introduces SimClinician, a multimodal, privacy-preserving simulation testbed with a dashboard, avatar-based visualization, a policy-driven decision layer, and a headless validator to study clinician–AI negotiation end-to-end. Using the expanded E-DAIC corpus, it shows that a lightweight confirmation step (a small, deliberate friction) raises acceptance by about 23 percentage points while keeping escalations under 9% and latency sub-second. It also demonstrates improved calibration of depression probabilities, with caveats for PTSD due to data imbalance. These results provide a scalable, ethics-aware framework for designing and evaluating clinician-in-the-loop AI before live deployment, informing interface design, risk communication, and governance for mental health diagnostics.

Abstract

AI-based mental health diagnosis is often judged by benchmark accuracy, yet in practice its value depends on how psychologists respond: whether they accept, adjust, or reject AI suggestions. Mental health makes this especially challenging: decisions are continuous and shaped by cues in tone, pauses, word choice, and nonverbal behaviors of patients. Current research rarely examines how AI diagnosis interface design influences these choices, leaving little basis for reliable testing before live studies. We present SimClinician, an interactive simulation platform that transforms patient data into psychologist-AI collaborative diagnosis. Contributions include: (1) a dashboard integrating audio, text, and gaze-expression patterns; (2) an avatar module rendering de-identified dynamics for analysis; (3) a decision layer that maps AI outputs to multimodal evidence, letting psychologists review AI reasoning and enter a diagnosis. Tested on the E-DAIC corpus (276 clinical interviews, expanded to 480,000 simulations), SimClinician shows that a confirmation step raises acceptance by 23% while keeping escalations below 9% and maintaining smooth interaction flow.
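The headline metrics (acceptance rate, escalation rate, and the gain from a confirmation step) can be illustrated with a small sketch. It is not the paper's implementation: the function names, the "escalate" label, and the decision counts are all hypothetical and chosen only to show the arithmetic. It also shows why a gain stated in percentage points differs from a relative percent gain.

```python
from collections import Counter


def decision_rates(decisions):
    """Return the share of each decision label in a list of simulated decisions."""
    counts = Counter(decisions)
    total = len(decisions)
    return {label: count / total for label, count in counts.items()}


def percentage_point_gain(baseline_rate, treatment_rate):
    """Absolute difference between two rates, expressed in percentage points."""
    return (treatment_rate - baseline_rate) * 100


def relative_gain_percent(baseline_rate, treatment_rate):
    """Relative change of the treatment rate with respect to the baseline, in percent."""
    return (treatment_rate - baseline_rate) / baseline_rate * 100


# Made-up decision logs for illustration only (NOT data from the paper).
baseline = ["accept"] * 50 + ["adjust"] * 30 + ["reject"] * 15 + ["escalate"] * 5
with_confirmation = ["accept"] * 73 + ["adjust"] * 15 + ["reject"] * 5 + ["escalate"] * 7

base = decision_rates(baseline)
conf = decision_rates(with_confirmation)

gain_pp = percentage_point_gain(base["accept"], conf["accept"])
gain_rel = relative_gain_percent(base["accept"], conf["accept"])

print(f"acceptance gain: {gain_pp:.1f} percentage points")
print(f"relative acceptance gain: {gain_rel:.1f}%")
print(f"escalation rate with confirmation: {conf['escalate']:.1%}")
```

With these invented counts, the same change in acceptance is a much larger number when expressed as a relative gain than as a percentage-point gain, so the unit matters when reading a reported figure.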

Paper Structure

This paper contains 25 sections, 17 figures, 1 table, and 1 algorithm.

Figures (17)

  • Figure 1: Overview panel. Entry screen showing PHQ-8 totals, PCL-C severity, and cut-off thresholds. These anchors ground subsequent multimodal analysis in validated clinical metrics.
  • Figure 2: Audio panel. High-resolution spectrogram with rails marking flat prosody, silence, and stress bursts. Threshold sliders encourage parameter probing, making uncertainty explicit.
  • Figure 3: Transcript dashboard. Ribbons show negative/positive language, hedging, and negation over time, alongside a temporal-focus line. Footer metrics summarize lexical diversity, pause ratios, and readability.
  • Figure 4: Quotes view. Clinicians can filter utterances by diagnostic categories, surfacing concise evidence for or against AI suggestions.
  • Figure 5: Keyword contrast tables. Left: top keywords for the full session. Right: keywords restricted to negative/negated utterances, highlighting contrastive salience.
  • ...and 12 more figures