When LLM Therapists Become Salespeople: Evaluating Large Language Models for Ethical Motivational Interviewing
Haein Kong, Seonghyeon Moon
TL;DR
This work evaluates whether large language models understand and ethically apply motivational interviewing (MI). Through MI knowledge testing, ethical-response prompts, and a novel Chain-of-Ethic (CoE) prompt, the authors show that while LLMs grasp MI principles reasonably well, their ethical alignment lags: they produce unethical MI outputs and are weak at detecting unethical responses. The Chain-of-Ethic approach improves both the generation of ethical MI responses and the detection of unethical ones, outperforming zero-shot Chain-of-Thought baselines in most cases. The findings emphasize the need for safety evaluations and practical guidelines for deploying LLM-powered psychotherapy, and establish a benchmarking pathway for future safety research. The work thus contributes a concrete mitigation strategy (CoE) and a safety-focused framework for evaluating ethical MI in AI systems, with implications for the policy and design of AI-assisted therapy tools.
Abstract
Large language models (LLMs) are increasingly applied in the mental health field. Recent research shows their promise in delivering psychotherapy, especially motivational interviewing (MI). However, few studies have investigated how language models understand MI ethics. Given the risk that malicious actors could use language models to apply MI for unethical purposes, it is important to evaluate their ability to differentiate between ethical and unethical MI practices. This study therefore investigates the ethical awareness of LLMs in MI through multiple experiments. Our findings show that LLMs have a moderate to strong level of MI knowledge. However, their ethical standards are not aligned with the MI spirit: they generated unethical responses and performed poorly at detecting unethical responses. To mitigate these risks and improve safety, we propose a Chain-of-Ethic prompt, which effectively improved both ethical MI response generation and detection performance. These findings highlight the need for safety evaluations and guidelines for building ethical LLM-powered psychotherapy.
