When LLM Therapists Become Salespeople: Evaluating Large Language Models for Ethical Motivational Interviewing

Haein Kong, Seonghyeon Moon

TL;DR

This work evaluates whether large language models understand and ethically apply motivational interviewing (MI). Through MI knowledge testing, ethical-response prompts, and a novel Chain-of-Ethic (CoE) prompt, the authors reveal that while LLMs grasp MI principles reasonably well, their ethical alignment lags, enabling unethical MI outputs and weak detection of unethical responses. The Chain-of-Ethic approach improves both the generation of ethical MI responses and the detection of unethical ones, outperforming zero-shot Chain-of-Thought baselines in most cases. The findings emphasize the need for safety evaluations and practical guidelines for deploying LLM-powered psychotherapy, and establish a benchmarking pathway for future safety research. Overall, the work contributes a concrete mitigation strategy (CoE) and proposes a safety-focused framework for evaluating ethical MI in AI systems, with implications for policy and design of ethical AI-assisted therapy tools.

Abstract

Large language models (LLMs) have been actively applied in the mental health field. Recent research shows the promise of LLMs in applying psychotherapy, especially motivational interviewing (MI). However, few studies have investigated how language models understand MI ethics. Given the risk that malicious actors could use language models to apply MI for unethical purposes, it is important to evaluate their ability to differentiate between ethical and unethical MI practices. This study therefore investigates the ethical awareness of LLMs in MI through multiple experiments. Our findings show that LLMs have a moderate to strong level of MI knowledge. However, their ethical standards are not aligned with the MI spirit: they generated unethical responses and performed poorly at detecting unethical responses. To mitigate these risks and improve safety, we propose a Chain-of-Ethic prompt, which effectively improved both ethical MI response generation and detection performance. These findings highlight the need for safety evaluations and guidelines for building ethical LLM-powered psychotherapy.

Paper Structure

This paper contains 25 sections, 3 figures, 13 tables.

Figures (3)

  • Figure 1: The overview of this study. Our experiment starts by evaluating LLMs' general knowledge of MI. Then, we collect LLM responses to unethical MI requests and annotate them with binary and multi-category labels. Next, we conduct prediction tests to measure how well LLMs identify unethical MI responses. Lastly, we test how effectively a Chain-of-Ethic prompt improves performance on the previous tasks.
  • Figure 2: The distribution of binary labels (unethical vs ethical). (a) shows the results for all products (neutral + harmful), (b) shows the results only for neutral products, and (c) shows the results only for harmful products.
  • Figure 3: The distribution of multi-category labels (0-3; 0 and 1 are ethical, 2 and 3 are unethical). (a) shows the results for all products (neutral + harmful), (b) shows the results only for neutral products, and (c) shows the results only for harmful products.
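
The relationship between the multi-category labels in Figure 3 and the binary labels in Figure 2 can be illustrated with a minimal sketch. It assumes only the mapping stated in the Figure 3 caption (0 and 1 are ethical, 2 and 3 are unethical). The function names and the sample labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Assumption: per the Figure 3 caption, labels 0 and 1 count as ethical
# and labels 2 and 3 count as unethical.
UNETHICAL_THRESHOLD = 2


def to_binary(label: int) -> str:
    """Collapse a multi-category label (0-3) into a binary label."""
    if label not in (0, 1, 2, 3):
        raise ValueError(f"label must be 0, 1, 2, or 3, got {label!r}")
    return "unethical" if label >= UNETHICAL_THRESHOLD else "ethical"


def binary_distribution(labels: list[int]) -> dict[str, float]:
    """Return the share of ethical and unethical labels in a list."""
    counts = Counter(to_binary(label) for label in labels)
    total = sum(counts.values())
    if total == 0:
        return {"ethical": 0.0, "unethical": 0.0}
    return {key: counts[key] / total for key in ("ethical", "unethical")}


if __name__ == "__main__":
    # Made-up annotations, for illustration only.
    sample_labels = [0, 1, 2, 3]
    print(binary_distribution(sample_labels))
```

Computing the distribution separately for neutral and harmful products would reproduce the three-panel split described in the captions, provided each annotation also carries its product type.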