Operationalizing Contextual Integrity in Privacy-Conscious Assistants

Sahra Ghalebikesabi; Eugene Bagdasaryan; Ren Yi; Itay Yona; Ilia Shumailov; Aneesh Pappu; Chongyang Shi; Laura Weidinger; Robert Stanforth; Leonard Berrada; Pushmeet Kohli; Po-Sen Huang; Borja Balle

TL;DR

We formalize the privacy problem in autonomous information-sharing AI assistants by framing it as context-aware information flows. The approach hinges on contextual integrity (CI), operationalized through Information Flow Cards (IFCs) and CI-based reasoning to approve or withhold data sharing, with task utility $U$ and privacy leakage $PL$ as core metrics. A novel form-filling benchmark with synthetic personas and human annotations quantifies these metrics, showing that CI-based reasoning improves privacy without sacrificing performance. The results demonstrate robustness to phrasing and model size, highlighting CI-based supervision as a practical path toward privacy-conscious AI assistants.
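The two metrics named above can be illustrated with a small sketch. The definitions here are an assumption, since this summary does not state them: utility is taken as the fraction of fields labeled necessary that the assistant shares, and privacy leakage as the fraction of fields not labeled necessary that it shares anyway. The function names and the set-based representation are hypothetical.

```python
from typing import Set


def utility(shared: Set[str], necessary: Set[str]) -> float:
    """Assumed definition of U: share of necessary fields that were filled in."""
    if not necessary:
        return 1.0  # nothing was required, so the task is trivially complete
    return len(shared & necessary) / len(necessary)


def privacy_leakage(shared: Set[str], necessary: Set[str], all_fields: Set[str]) -> float:
    """Assumed definition of PL: share of unnecessary fields that were filled in."""
    unnecessary = all_fields - necessary
    if not unnecessary:
        return 0.0  # there was nothing that could leak
    return len(shared & unnecessary) / len(unnecessary)


# Toy form: four fields, two of which annotators marked as necessary.
all_fields = {"name", "email", "ssn", "blood_type"}
necessary = {"name", "email"}
shared = {"name", "ssn"}  # what a hypothetical assistant submitted

u = utility(shared, necessary)
pl = privacy_leakage(shared, necessary, all_fields)
```

Under these assumed definitions, an ideal assistant reaches a utility of 1 and a privacy leakage of 0 at the same time, which is the sense in which privacy can improve without sacrificing performance.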

Abstract

Advanced AI assistants combine frontier LLMs and tool access to autonomously perform complex tasks on behalf of users. While the helpfulness of such assistants can increase dramatically with access to user information including emails and documents, this raises privacy concerns about assistants sharing inappropriate information with third parties without user supervision. To steer information-sharing assistants to behave in accordance with privacy expectations, we propose to operationalize contextual integrity (CI), a framework that equates privacy with the appropriate flow of information in a given context. In particular, we design and evaluate a number of strategies to steer assistants' information-sharing actions to be CI-compliant. Our evaluation is based on a novel form-filling benchmark composed of human annotations of common webform applications, and it reveals that prompting frontier LLMs to perform CI-based reasoning yields strong results.

Paper Structure (50 sections, 1 equation, 16 figures, 9 tables)

Introduction
CI in form-filling assistants as a first step towards general privacy-conscious AI assistants
Related work.
Design and Evaluation of Information-Sharing Assistants
Information-sharing assistants.
Privacy and utility.
Privacy and utility.
Simplifying assumptions.
Metrics against ground truth.
Designing Privacy-Conscious Information Sharing Assistants
Contextual integrity theory.
Assistant Designs
Self-censoring assistant.
Assistant with binary supervisor.
Assistant with reasoning supervisor.
...and 35 more sections

Figures (16)

Figure 1: The assistant operates autonomously by accessing personal data and filling in forms on behalf of the user. While performing its assigned task, this particular assistant builds Information Flow Cards to decide whether sharing a piece of information is necessary for the task at hand.
Figure 2: Journey of an information-sharing action. Given a user query (1) the assistant retrieves user data (2), (optionally) communicates with the 3rd party to identify what information it requests and how it needs to be formatted (3), crafts a response based on the 3rd party's request and available user data (4), and sends the response (5).
Figure 3: The four types of assistant we consider differ in how they implement privacy judgements.
Figure 4: Ratio of "necessary" to "relevant" labels across the form applications F1-F14 for each rater. The form applications are listed in Supplementary Table \ref{tab:preferences}.
Figure 5: Example of a form-filling query from our benchmark and how different assistants based on Gemini Ultra respond. In some cases (as here), the reasoning supervisor fails to provide its reasoning and replies only with its decision. Even with CoT, the model output adds no further explanation. The CI-based supervisor provides more structured reasoning that can, for example, be used to interpret model failures.
...and 11 more figures
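
The flow described in the captions of Figures 1 to 3 (retrieve user data, build an Information Flow Card per field, let a supervisor approve or withhold it) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the card fields follow the standard parameters of contextual integrity theory (sender, recipient, subject, information type, transmission principle), while the class name, the `fill_form` helper, and the rule-based `toy_supervisor` standing in for an LLM supervisor are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class InformationFlowCard:
    """Hypothetical card describing one candidate information flow."""
    sender: str
    recipient: str
    subject: str
    information_type: str
    transmission_principle: str
    task: str


# A supervisor maps a card to an approve (True) or withhold (False) decision.
Supervisor = Callable[[InformationFlowCard], bool]


def fill_form(form_fields: List[str], user_data: Dict[str, str],
              recipient: str, task: str, supervisor: Supervisor) -> Dict[str, str]:
    """Fill each requested field only if the supervisor approves the flow."""
    response: Dict[str, str] = {}
    for field in form_fields:
        if field not in user_data:
            response[field] = ""  # nothing to share for this field
            continue
        card = InformationFlowCard(
            sender="assistant",
            recipient=recipient,
            subject="user",
            information_type=field,
            transmission_principle="submitted via web form",
            task=task,
        )
        response[field] = user_data[field] if supervisor(card) else ""
    return response


def toy_supervisor(card: InformationFlowCard) -> bool:
    """Rule-based stand-in for an LLM supervisor (assumption for this sketch)."""
    allowed = {"newsletter signup": {"name", "email"}}
    return card.information_type in allowed.get(card.task, set())


user_data = {"name": "Ada", "email": "ada@example.com", "ssn": "000-00-0000"}
result = fill_form(["name", "email", "ssn"], user_data,
                   recipient="news site", task="newsletter signup",
                   supervisor=toy_supervisor)
```

In this sketch the supervisor is a plain callable, so a binary, reasoning, or CI-based supervisor could be swapped in without changing `fill_form`.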
