Using LLMs to Aid Annotation and Collection of Clinically-Enriched Data in Bipolar Disorder and Schizophrenia
Ankit Aich, Avery Quynh, Pamela Osseyi, Amy Pinkham, Philip Harvey, Brenda Curtis, Colin Depp, Natalie Parde
TL;DR
This work demonstrates that fine-tuned Seq2Seq language models can effectively assist in both collecting clinically enriched data and annotating it for domain-specific variables in bipolar disorder and schizophrenia. By pairing a context-aware interviewer with a dedicated annotation model, the authors build a scalable pipeline that outperforms large commercial LLMs on domain tasks and maintains high inter-annotator reliability without making diagnostic claims. A chained pipeline further shows end-to-end viability for data collection and scoring with minimal performance loss. The study emphasizes practical utility, ethical safeguards, and potential for broader adoption in clinical research, while acknowledging limitations such as sample size and modality scope.
Abstract
NLP in mental health has focused primarily on social media. Real-world practitioners, meanwhile, carry high caseloads and often work with domain-specific variables for which modern LLMs lack context. We use a dataset collected from 644 participants, including individuals diagnosed with Bipolar Disorder (BD) or Schizophrenia (SZ) and Healthy Controls (HC). Participants undertook tasks derived from a standardized mental health instrument, and the resulting data were transcribed and annotated by experts across five clinical variables. This paper demonstrates the application of contemporary language models in sequence-to-sequence tasks to enhance mental health research. Specifically, we illustrate how these models can facilitate the deployment of mental health instruments, data collection, and data annotation with high accuracy and scalability. We show that small models are capable of annotating domain-specific clinical variables and collecting data for mental health instruments, and that they perform better than commercial large models.
