Prevalence of Security and Privacy Risk-Inducing Usage of AI-based Conversational Agents
Kathrin Grosse, Nico Ebert
TL;DR
This study addresses security and privacy risks in AI-based conversational agents (CAs) by surveying a large, demographically representative sample of UK adults and focusing on those who use CAs regularly. It combines a three-stage questionnaire (demographics, usage behaviors, and privacy concerns) with statistical analyses to quantify the prevalence of risk-inducing practices, such as uploading non-self-created content, granting program access, and jailbreaking. The key findings reveal substantial exposure to risk surfaces: roughly one-third of regular users share potentially insecure inputs or grant program access, and about a quarter have tried jailbreaking. Many users redact or avoid sharing sensitive data, though some highly sensitive information is still disclosed, and awareness of data-use policies and opt-out options remains limited. The results underscore an urgent need for AI guardrails, clearer data-use transparency from vendors, and organizational policies to mitigate threats in security- and privacy-critical contexts, while guiding future research on context-specific predictors and defense strategies.
Abstract
Recent improvements in large language models (LLMs) have led to everyday usage of AI-based Conversational Agents (CAs). At the same time, LLMs are vulnerable to an array of threats, including jailbreaks and inputs crafted to trigger, for example, remote code execution. As a result, users may unintentionally introduce risks, for example, by uploading malicious files or disclosing sensitive information. However, the extent to which such user behaviors occur and thus potentially facilitate exploits remains largely unclear. To shed light on this issue, we surveyed a representative sample of 3,270 UK adults in 2024 using Prolific. A third of these use CA services such as ChatGPT or Gemini at least once a week. Of these "regular users", up to a third exhibited behaviors that may enable attacks, and a fourth have tried jailbreaking (often for understandable reasons such as curiosity, fun, or information seeking). Half state that they sanitize data, and most participants report not sharing sensitive data. However, a few share very sensitive data such as passwords. The majority are unaware that their data can be used to train models and that they can opt out. Our findings suggest that current academic threat models manifest in the wild, and that mitigations or guidelines for the secure usage of CAs should be developed. In areas critical to security and privacy, CAs must be equipped with effective AI guardrails to prevent, for example, revealing sensitive information to curious employees. Vendors need to increase efforts to prevent the entry of sensitive data, and to create transparency with regard to data usage policies and settings.
