Therapeutic AI and the Hidden Risks of Over-Disclosure: An Embedded AI-Literacy Framework for Mental Health Privacy

Soraya S. Anvari, Rina R. Wehbe

TL;DR

This work addresses privacy and safety risks in AI-assisted mental health by embedding an AI literacy framework directly into conversational systems. It introduces a privacy-preserving wrapper with three modules—Prompt Coach, Disclosure Monitor, and Transparency Engine—that operate locally to guide prompt quality, regulate disclosures, and provide transparent data handling explanations. The framework emphasizes on-device processing to minimize data leakage while enhancing user trust and engagement. A planned longitudinal study aims to quantify improvements in prompting, disclosure safety, trust, and usability, informing future refinements and clinical collaborations.

Abstract

Large Language Models (LLMs) are increasingly deployed in mental health contexts, from structured therapeutic support tools to informal chat-based well-being assistants. While these systems increase accessibility, scalability, and personalization, their integration into mental health care brings privacy and safety challenges that have not been well examined. Unlike traditional clinical interactions, LLM-mediated therapy often lacks a clear structure for what information is collected, how it is processed, and how it is stored or reused. Users without clinical guidance may over-disclose personal information, some of it irrelevant to their presenting concern, because of misplaced trust, limited awareness of data risks, or the conversational design of the system. This over-disclosure raises privacy concerns and also increases the potential for LLM bias, misinterpretation, and long-term data misuse. We propose a framework that embeds Artificial Intelligence (AI) literacy interventions directly into mental health conversational systems, and we outline a study plan to evaluate their impact on disclosure safety, trust, and user experience.

Paper Structure

This paper contains 9 sections, 1 figure.

Figures (1)

  • Figure 1: Architecture and flow of the Embedded AI Literacy Framework, showing user interaction, literacy modules (Prompt Coach, Disclosure Monitor, Transparency Engine), and LLM integration.
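
To make the flow in Figure 1 concrete, the following is a minimal sketch of how a local wrapper might chain the three modules before a message reaches the LLM. It is illustrative only: the summary above does not specify how the modules are implemented, so the function names, the regular expressions, the five-word threshold, and the `WrapperResult` type are all assumptions introduced here, not the authors' design.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns: what counts as over-disclosure is an assumption here.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


@dataclass
class WrapperResult:
    outgoing_prompt: str                       # text that would be sent to the LLM
    coaching_tips: list = field(default_factory=list)
    redacted_kinds: list = field(default_factory=list)
    explanation: str = ""                      # shown to the user


def prompt_coach(text):
    """Assumed heuristic: very short prompts get a nudge to add context."""
    tips = []
    if len(text.split()) < 5:
        tips.append("Consider adding context about the kind of support you want.")
    return tips


def disclosure_monitor(text):
    """Replace matches of each pattern locally and report which kinds were found."""
    kinds = []
    for kind, pattern in PATTERNS.items():
        if pattern.search(text):
            kinds.append(kind)
            text = pattern.sub(f"[{kind} removed]", text)
    return text, kinds


def transparency_engine(kinds):
    """Explain in plain language what was changed before sending."""
    if not kinds:
        return "Nothing was removed; your message is sent as written."
    return "Removed on your device before sending: " + ", ".join(kinds) + "."


def wrap(user_text):
    """Run all three modules locally; only outgoing_prompt would leave the device."""
    tips = prompt_coach(user_text)
    safe_text, kinds = disclosure_monitor(user_text)
    return WrapperResult(safe_text, tips, kinds, transparency_engine(kinds))
```

In this sketch, calling `wrap(...)` on a message returns the redacted text alongside the coaching tips and the user-facing explanation, so the original message never needs to leave the device. A real Disclosure Monitor would likely need far richer detection than two regular expressions; the point is only the ordering of the modules relative to the LLM call.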