Privacy Leakage Overshadowed by Views of AI: A Study on Human Oversight of Privacy in Language Model Agent

Zhiping Zhang, Bingcan Guo, Tianshi Li

TL;DR

This study investigates how people oversee privacy implications when language model (LM) agents act on their behalf in asynchronous interpersonal communication tasks. Using a task-based online survey ($N=300$) across six PrivacyLens scenarios, the authors compare participants' own drafts with agent-generated responses and quantify the resulting privacy leakage. Many participants favor the agent's response even when it leaks more, or consider both responses good, which raises harmful disclosure from $15.7\%$ to as much as $55.0\%$. The authors identify six distinct privacy-oversight patterns that reflect varying concerns, trust, and privacy preferences, revealing a substantial gap between subjective harms and objective privacy norms. The work highlights the need for privacy-preserving agent design and bidirectional alignment between user preferences and AI behavior, with implications for how to scaffold oversight and calibrate trust in agentic systems.

Abstract

Language model (LM) agents that act on users' behalf for personal tasks (e.g., replying to emails) can boost productivity, but are also susceptible to unintended privacy leakage risks. We present the first study on people's capacity to oversee the privacy implications of LM agents. By conducting a task-based survey ($N=300$), we investigate how people react to and assess the response generated by LM agents for asynchronous interpersonal communication tasks, compared with a response they wrote themselves. We found that people may favor the agent response with more privacy leakage over the response they drafted, or consider both good, increasing harmful disclosure from 15.7% to 55.0%. We further identified six privacy behavior patterns reflecting varying concerns, trust levels, and privacy preferences underlying people's oversight of LM agents' actions. Our findings shed light on designing agentic systems that enable privacy-preserving interactions and achieve bidirectional alignment on privacy preferences to help users calibrate trust.

Paper Structure

This paper contains 65 sections, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Common use cases of LM agents organized by task domain and interaction type.
  • Figure 2: Mean and sample standard deviation of participants' harmfulness ratings (7-point Likert scale) for the pre-defined information items in the six scenarios. The responses shown here are those generated by the LM agents. The left side visualizes the average harmfulness of each information item, from Extremely Unharmful to Extremely Harmful. The right side visualizes the sample standard deviation of the harmfulness ratings, with darker gray indicating a larger standard deviation.
  • Figure 3: Trust towards the LM agent before and after seeing the LM agent's response and being informed of the privacy tuples, across different patterns. Differences were tested using the Wilcoxon signed-rank test with Bonferroni correction. ** indicates $p<0.01$, *** indicates $p<0.001$.
  • Figure 4: Participants' comfort with delegating different levels of tasks to the LM agent before and after seeing the agent's response and being informed of the privacy tuples. Thick black lines indicate medians. Differences were tested using the Wilcoxon signed-rank test with Bonferroni correction. * indicates $p<0.05$, *** indicates $p<0.001$.
  • Figure 5: Six asynchronous interpersonal communication scenarios and tasks