Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers
Jared Moore, Declan Grabb, William Agnew, Kevin Klyman, Stevie Chancellor, Desmond C. Ong, Nick Haber
TL;DR
This paper investigates whether large language models (LLMs) can safely replace mental health providers. It builds a guideline-driven evaluation by mapping ten major US/UK therapeutic guidelines into 17 core features of an effective therapeutic relationship and tests LLMs against these features. The authors report that contemporary LLMs express stigma toward mental illness and give unsafe or misguided responses to crises, delusions, and suicidality, and that these failures persist even in larger and newer models, pointing to gaps in current safety practices. They conclude that LLMs should not replace therapists and discuss constructive roles for LLMs as adjuncts, decision-support tools, or standardized training aids, with emphasis on human oversight and safety.
Abstract
Should a large language model (LLM) be used as a therapist? In this paper, we investigate the use of LLMs to *replace* mental health providers, a use case promoted in the tech startup and research space. We conduct a mapping review of therapy guides used by major medical institutions to identify crucial aspects of therapeutic relationships, such as the importance of a therapeutic alliance between therapist and client. We then assess the ability of LLMs to reproduce and adhere to these aspects of therapeutic relationships by conducting several experiments investigating the responses of current LLMs, such as `gpt-4o`. Contrary to best practices in the medical community, LLMs 1) express stigma toward those with mental health conditions and 2) respond inappropriately to certain common (and critical) conditions in naturalistic therapy settings; for example, LLMs encourage clients' delusional thinking, likely due to their sycophancy. This occurs even with larger and newer LLMs, indicating that current safety practices may not address these gaps. Furthermore, we note foundational and practical barriers to the adoption of LLMs as therapists, such as the fact that a therapeutic alliance requires human characteristics (e.g., identity and stakes). For these reasons, we conclude that LLMs should not replace therapists, and we discuss alternative roles for LLMs in clinical therapy.
