AutoPsyC: Automatic Recognition of Psychodynamic Conflicts from Semi-structured Interviews with Large Language Models
Sayed Muddashir Hossain, Simon Ostermann, Patrick Gebhard, Cord Benecke, Josef van Genabith, Philipp Müller
TL;DR
AutoPsyC is the first LLM-based system to automatically recognize and rate psychodynamic conflicts from full-length OPD interviews. It combines interview summarization, Retrieval-Augmented Generation (RAG) over domain knowledge from the OPD manual, and segment-wise fine-tuning of four Llama 3.1 models, whose predictions are fused by a weighted ensemble. On 141 OPD interviews of about 90 minutes each, AutoPsyC achieves substantial gains over naive baselines, reaching 0.78–0.81 weighted F1 for several conflicts and improving consistently across configurations. The study shows that mid-interview content is particularly informative, highlights the feasibility of computational psychodynamic assessment, and discusses ethical considerations and the need for cautious, clinician-augmented deployment.
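To make the weighted-ensemble fusion step concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each of the four fine-tuned models covers one interview segment and outputs a probability distribution over rating levels; the function name `fuse_weighted`, the three rating levels, the example probabilities, and the weights (which favour the middle segments) are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def fuse_weighted(
    probs_per_model: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[int, List[float]]:
    """Fuse per-model class probabilities by a weighted average.

    Returns the index of the winning class and the fused distribution.
    This is a generic weighted-ensemble sketch, not the paper's exact rule.
    """
    if len(probs_per_model) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(probs_per_model[0])
    # Weighted average of each class probability across the models.
    fused = [
        sum(w * p[c] for p, w in zip(probs_per_model, weights)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=fused.__getitem__)
    return label, fused


# Hypothetical outputs of four segment models over three rating levels.
segment_probs = [
    [0.6, 0.3, 0.1],  # model for segment 1 (start of interview)
    [0.2, 0.5, 0.3],  # model for segment 2
    [0.1, 0.6, 0.3],  # model for segment 3
    [0.7, 0.2, 0.1],  # model for segment 4 (end of interview)
]
# Assumed weights that emphasise the middle segments.
segment_weights = [0.15, 0.35, 0.35, 0.15]

label, fused = fuse_weighted(segment_probs, segment_weights)
print(label, [round(x, 2) for x in fused])
```

In practice, such weights would be tuned on held-out data rather than set by hand; the values here only illustrate how giving more weight to mid-interview segments can change the fused prediction.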
Abstract
Psychodynamic conflicts are persistent, often unconscious themes that shape a person's behaviour and experiences. Accurate diagnosis of psychodynamic conflicts is crucial for effective patient treatment and is commonly done via long, manually scored semi-structured interviews. Existing automated solutions for psychiatric diagnosis tend to focus on the recognition of broad disorder categories such as depression, and it is unclear to what extent psychodynamic conflicts, to which even patients themselves may not have conscious access, can be automatically recognised from conversation. In this paper, we propose AutoPsyC, the first method for recognising the presence and significance of psychodynamic conflicts from full-length Operationalized Psychodynamic Diagnostics (OPD) interviews using Large Language Models (LLMs). Our approach combines recent advances in parameter-efficient fine-tuning and Retrieval-Augmented Generation (RAG) with a summarisation strategy to effectively process entire 90-minute conversations. In evaluations on a dataset of 141 diagnostic interviews, we show that AutoPsyC consistently outperforms all baselines and ablation conditions on the recognition of four highly relevant psychodynamic conflicts.
