Automating PTSD Diagnostics in Clinical Interviews: Leveraging Large Language Models for Trauma Assessments
Sichang Tu, Abigail Powers, Natalie Merrill, Negar Fani, Sierra Carter, Stephen Doogan, Jinho D. Choi
TL;DR
The paper presents an end-to-end framework that automates PTSD diagnostic assessments from clinician-administered interviews using GPT-4 and Llama-2. It introduces a dataset of 411 clinician-administered PTSD interviews and an annotation pipeline that transforms conversations into structured variables via an assessment-pairing scheme. Empirical results show that GPT-4 generally achieves higher accuracy than Llama-2 across variable types. The results suggest that LLMs could help scale diagnostic workflows, and they highlight sources of error such as misalignment and transcription issues. The work demonstrates the feasibility of AI-assisted diagnostic validation in mental health and emphasizes careful deployment with clinician oversight and privacy considerations.
Abstract
A shortage in the clinical workforce presents significant challenges in mental healthcare, limiting access to formal diagnostics and services. We aim to tackle this shortage by integrating a customized large language model (LLM) into the workflow, thus promoting equity in mental healthcare for the general population. Although LLMs have showcased their capability in clinical decision-making, their adaptation to severe conditions like Post-traumatic Stress Disorder (PTSD) remains largely unexplored. Therefore, we collect 411 clinician-administered diagnostic interviews and devise a novel approach to obtain high-quality data. Moreover, we build a comprehensive framework to automate PTSD diagnostic assessments based on interview content by leveraging two state-of-the-art LLMs, GPT-4 and Llama-2, with potential for broader clinical diagnoses. Our results, obtained on our dataset, show strong promise for LLMs to aid clinicians in diagnostic validation. To the best of our knowledge, this is the first AI system that fully automates assessments for mental illness based on clinician-administered interviews.
