Controlling What You Share: Assessing Language Model Adherence to Privacy Preferences

Guillem Ramírez, Alexandra Birch, Ivan Titov

TL;DR

The paper tackles privacy risks in API-driven LLM usage by introducing privacy profiles that let users specify what can be shared with external models. In a two-tiered system, a local model paraphrases user queries, which are then optionally answered by a more capable external LLM, and an aggregator reconciles the outputs. The authors present PEEP, a multilingual dataset of 15,282 real queries with synthetic privacy profiles, and show that fine-tuned lightweight LLMs can match or surpass larger zero-shot models in both privacy protection and task performance, though some leakage persists. The work highlights practical pathways for privacy-preserving LLM use and emphasizes the need for models that better understand user-defined privacy preferences and for further reductions in leakage in real-world deployments.

Abstract

Large language models (LLMs) are primarily accessed via commercial APIs, but this often requires users to expose their data to service providers. In this paper, we explore how users can stay in control of their data by using privacy profiles: simple natural language instructions that say what should and should not be revealed. We build a framework where a local model uses these instructions to rewrite queries, only hiding details deemed sensitive by the user, before sending them to an external model, thus balancing privacy with performance. To support this research, we introduce PEEP, a multilingual dataset of real user queries annotated to mark private content and paired with synthetic privacy profiles. Experiments with lightweight local LLMs show that, after fine-tuning, they not only achieve markedly better privacy preservation but also match or exceed the performance of much larger zero-shot models. At the same time, the system still faces challenges in fully adhering to user instructions, underscoring the need for models with a better understanding of user-defined privacy preferences.
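The delegation flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `PrivacyProfile`, `answer_with_delegation`, and the `toy_*` functions are hypothetical names, and the toy functions are plain string stand-ins for the local and external LLMs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PrivacyProfile:
    """Hypothetical container for the user's natural-language privacy instructions."""
    instructions: str


# Assumed interfaces: each model is represented by a plain callable.
Rewriter = Callable[[str, PrivacyProfile], Optional[str]]  # local model; None = do not delegate
ExternalModel = Callable[[str], str]                       # external, untrusted model
Aggregator = Callable[[str, Optional[str]], str]           # local model integrating the response


def answer_with_delegation(
    query: str,
    profile: PrivacyProfile,
    rewrite: Rewriter,
    external: ExternalModel,
    aggregate: Aggregator,
) -> str:
    """Rewrite the query locally, optionally delegate it, then integrate the result."""
    paraphrase = rewrite(query, profile)
    external_response = None
    if paraphrase is not None:
        # Only the paraphrase leaves the device, never the original query.
        external_response = external(paraphrase)
    return aggregate(query, external_response)


# Toy stand-ins so the sketch runs without any model.
sent_to_external: List[str] = []


def toy_rewrite(query: str, profile: PrivacyProfile) -> Optional[str]:
    # A real local LLM would interpret the profile; here one name is hard-coded.
    if "name" in profile.instructions.lower():
        return query.replace("Maria", "a friend")
    return query


def toy_external(prompt: str) -> str:
    sent_to_external.append(prompt)
    return f"[external answer to: {prompt}]"


def toy_aggregate(query: str, external_response: Optional[str]) -> str:
    # Fall back to a local-only answer when nothing was delegated.
    return external_response if external_response is not None else "[local answer]"


result = answer_with_delegation(
    "Write a birthday note for Maria",
    PrivacyProfile("Do not reveal any names."),
    toy_rewrite,
    toy_external,
    toy_aggregate,
)
```

Passing the three stages as callables keeps the trust boundary visible: the external model only ever receives whatever `rewrite` returns, so any leakage can be checked by inspecting that one string.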

Paper Structure

This paper contains 78 sections, 2 equations, 5 figures, 11 tables.

Figures (5)

  • Figure 1: Scheme of our pipeline for privacy-conscious query delegation. A local LLM (purple boxes) receives a request from a user, along with some privacy specifications. If the query can be paraphrased safely, we send the paraphrase to an external, untrusted LLM (green box). Finally, the local model integrates the response.
  • Figure 2: On the left, the difficulty gap (difference in $\text{Leak}_{\text{PRO}}$ for hard and easy information) before and after fine-tuning (FT). On the right, the context gap (difference in $\text{Leak}_{\text{PRO}}$ for appropriate and inappropriate). We observe that FT most consistently reduces the context gap.
  • Figure 3: Histogram of the number of personal details identified for each user.
  • Figure 4: Most common combinations of information attributes extracted. We have omitted the types languages, occupations, and connections.
  • Figure 5: Number of sensitive attributes per user (Q3).
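
The two gaps named in the Figure 2 caption can be written out explicitly. The notation below is an assumption for illustration: the symbols $\Delta_{\text{difficulty}}$ and $\Delta_{\text{context}}$ and the superscripts are not taken from the paper, and the order of subtraction simply follows the order in which the caption lists the two conditions.

```latex
\Delta_{\text{difficulty}}
  = \text{Leak}_{\text{PRO}}^{\text{hard}} - \text{Leak}_{\text{PRO}}^{\text{easy}},
\qquad
\Delta_{\text{context}}
  = \text{Leak}_{\text{PRO}}^{\text{appropriate}} - \text{Leak}_{\text{PRO}}^{\text{inappropriate}}
```

Under this reading, each gap is computed once before and once after fine-tuning, and a reduction means its magnitude shrinks after fine-tuning.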