Privacy Reasoning in Ambiguous Contexts

Ren Yi, Octavian Suciu, Adria Gascon, Sarah Meiklejohn, Eugene Bagdasarian, Marco Gruteser

TL;DR

This work addresses how large language models reason about appropriate information disclosure under context ambiguity, a key barrier to practical agentic privacy. It introduces Camber, a context disambiguation framework with label-independent, label-dependent, and reasoning-guided expansions, to systematically clarify ambiguous scenarios. Across PrivacyLens+ and ConfAIde+ datasets, Camber yields significant improvements in precision and recall (up to 13.3% and 22.3%, respectively) and substantially reduces prompt sensitivity, supported by entropy analyses from both model outputs and human judgments. The findings suggest that explicit, reasoning-informed context clarification can greatly enhance privacy reasoning in production-ready agents, with important implications for designing user-interactive clarification mechanisms.

Abstract

We study the ability of language models to reason about appropriate information disclosure, a central aspect of the evolving field of agentic privacy. Whereas previous works have focused on evaluating a model's ability to align with human decisions, we examine the role of ambiguity and missing context on model performance when making information-sharing decisions. We identify context ambiguity as a crucial barrier to high performance in privacy assessments. By designing Camber, a framework for context disambiguation, we show that model-generated decision rationales can reveal ambiguities and that systematically disambiguating context based on these rationales leads to significant accuracy improvements (up to 13.3% in precision and up to 22.3% in recall) as well as reductions in prompt sensitivity. Overall, our results indicate that approaches for context disambiguation are a promising way forward to enhance agentic privacy reasoning.

Paper Structure

This paper contains 56 sections, 15 figures, 8 tables.

Figures (15)

  • Figure 1: Left: The Camber framework studies the impact of ambiguity on privacy judgments and highlights the benefits of resolving it through various disambiguation strategies. Right: Camber aims to inform the design of context clarification mechanisms for future agents completing personalized privacy tasks on behalf of users.
  • Figure 2: Underspecified example misclassified by Gemini 2.5 Pro as appropriate. When asked for reasoning, the model generates plausible assumptions about the communication channel and company protocols.
  • Figure 3: The three disambiguation strategies implemented in Camber -- ① label-independent, ② label-dependent, and ③ reasoning-guided -- demonstrated with a PrivacyLens+ example.
  • Figure 4: Reasoning-guided expansion results in significant performance gains and reduction in prompt sensitivity over all other expansions. Figure shows $F_1$ scores for reasoning-guided expansion (Reasoning Guided) compared to those of no expansion (No Exp.), label-independent expansion (Label Indep.) and label-dependent expansion (Label Dep.) across PrivacyLens+ (left) and ConfAIde+ (right) datasets, across 3 prompt variants and across Gemini 2.5 Pro, GPT 4.1, and Claude 3.7 Sonnet models. For each of the three expansion strategies, we report the average $F_1$ scores across all fields / codes (Average over fields & codes), and the $F_1$ scores for the top-performing field / code (Top performing field & code). The top-performing fields for PrivacyLens+ are identical across all models: data type for label-independent, transmission principle for label-dependent, and consent for reasoning-guided expansion. For the ConfAIde+ dataset, the results are as follows: for label-dependent expansion, all models use subject agent; for reasoning-guided, Gemini and Claude use consent while GPT uses sender authorization; and for label-independent, the choices are aware agent (Gemini), subject agent (GPT), and aware agent relation (Claude). Error bars show 95% confidence intervals generated by bootstrapping the experiment results 1,000 times with replacement.
  • Figure 5: The entropy scores of the LLM judgments, for no expansion examples (No Exp) and reasoning-guided expansion examples across both datasets. For each example, the entropy score is computed from the probability distribution of 100 privacy judgments generated by Gemini 2.5 Pro with temperature 1.0.
  • ...and 10 more figures
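
The captions of Figures 4 and 5 mention two measurements: an entropy score computed from the distribution of 100 sampled judgments per example, and 95% confidence intervals obtained by bootstrapping 1,000 times with replacement. The following is a minimal Python sketch of how such quantities could be computed, not the authors' implementation. The function names, the "appropriate"/"inappropriate" label strings, the use of base-2 logarithms, the percentile method for the interval, and resampling at the level of individual examples are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def judgment_entropy(judgments):
    """Shannon entropy (in bits, an assumed unit) of the empirical
    distribution over repeated judgments for a single example."""
    counts = Counter(judgments)
    total = len(judgments)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def f1_score(y_true, y_pred, positive="appropriate"):
    """F1 for the (assumed) positive class; returns 0.0 when there are
    no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bootstrap_f1_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for F1, resampling examples with
    replacement (an assumed reading of the Figure 4 caption)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx],
                               [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

Under these assumptions, an example whose 100 sampled judgments all agree has entropy 0, while an even split between two labels has entropy 1 bit, so lower entropy after expansion would indicate more consistent judgments.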