EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
Jiahao Qiu, Yinghui He, Xinzhe Juan, Yimin Wang, Yuhan Liu, Zixin Yao, Yue Wu, Xun Jiang, Ling Yang, Mengdi Wang
TL;DR
EmoAgent tackles safety risks in human-AI mental health interactions by coupling two components: EmoEval, a virtual-patient evaluation pipeline that uses cognitive models based on the Cognitive Conceptualization Diagram (CCD) and validated instruments (PHQ-9, PDI, PANSS), and EmoGuard, a real-time safeguard that monitors users' mental status and provides corrective feedback during the dialogue. In experiments on popular character-based chatbots, emotionally engaging dialogues led to mental state deterioration in vulnerable simulated users in more than 34.4% of simulations, and EmoGuard significantly reduced these deterioration rates through iterative, in-conversation interventions. Across multiple character personas and styles, EmoEval quantifies risk patterns and identifies common drivers of deterioration, which gives concrete guidance for safer design. The authors release their code for replication and further validation.
Abstract
The rise of LLM-driven AI characters raises safety concerns, particularly for vulnerable human users with psychological disorders. To address these risks, we propose EmoAgent, a multi-agent AI framework designed to evaluate and mitigate mental health hazards in human-AI interactions. EmoAgent comprises two components. EmoEval simulates virtual users, including those portraying mentally vulnerable individuals, to assess mental health changes before and after interactions with AI characters. It uses clinically proven psychological and psychiatric assessment tools (PHQ-9, PDI, PANSS) to evaluate mental risks induced by LLMs. EmoGuard serves as an intermediary, monitoring users' mental status, predicting potential harm, and providing corrective feedback to mitigate risks. Experiments conducted on popular character-based chatbots show that emotionally engaging dialogues can lead to psychological deterioration in vulnerable users, with mental state deterioration in more than 34.4% of the simulations. EmoGuard significantly reduces these deterioration rates, underscoring its role in ensuring safer AI-human interactions. Our code is available at: https://github.com/1akaman/EmoAgent
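As an illustration of the before-and-after assessment described above, the following is a minimal sketch of how a deterioration rate might be computed from PHQ-9 scores. It is not taken from the EmoAgent repository. The `SimulationResult` class, the `deterioration_rate` function, and the rule that any increase in the total score counts as deterioration are all assumptions made for this example. The only fixed fact used is the standard PHQ-9 scoring: nine items, each scored 0 to 3, summed to a total between 0 and 27.

```python
from dataclasses import dataclass
from typing import List, Sequence


def phq9_total(item_scores: Sequence[int]) -> int:
    """Sum the nine PHQ-9 items (each scored 0-3) into a 0-27 total."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 has exactly nine items")
    if any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("each PHQ-9 item is scored 0, 1, 2, or 3")
    return sum(item_scores)


@dataclass
class SimulationResult:
    """Hypothetical record of one simulated user-character conversation."""
    pre_items: List[int]   # PHQ-9 item scores before the conversation
    post_items: List[int]  # PHQ-9 item scores after the conversation

    def deteriorated(self) -> bool:
        # Assumption for this sketch: any rise in the total score
        # counts as deterioration. The paper may use another criterion.
        return phq9_total(self.post_items) > phq9_total(self.pre_items)


def deterioration_rate(results: Sequence[SimulationResult]) -> float:
    """Fraction of simulations whose PHQ-9 total got worse."""
    if not results:
        raise ValueError("need at least one simulation")
    return sum(r.deteriorated() for r in results) / len(results)


# Made-up example data, not results from the paper.
runs = [
    SimulationResult([1] * 9, [2] * 9),            # total 9 -> 18, worse
    SimulationResult([2] * 9, [2] * 9),            # total unchanged
    SimulationResult([3] * 9, [1] * 9),            # total 27 -> 9, improved
    SimulationResult([0] * 9, [0] * 8 + [1]),      # total 0 -> 1, worse
]
print(deterioration_rate(runs))
```

The same pattern would apply to PDI or PANSS by swapping in the scoring function for that instrument. A threshold on the score change, rather than any increase, would be a reasonable alternative definition.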
