Operationalizing Data Minimization for Privacy-Preserving LLM Prompting

Jijie Zhou, Niloofar Mireshghallah, Tianshi Li

TL;DR

This work formalizes data minimization for privacy-preserving LLM prompting by casting the problem as a privacy-utility optimization over a span-level transformation space and solving it with a priority-queue, two-stage Freeze-Then-Search algorithm. It instantiates an ordinal action lattice {RETAIN, ABSTRACT, REDACT}, defines a privacy comparator, and relies on a context-recovery step to evaluate utility, enabling a black-box, pre-inference approach compatible with diverse models. Empirical results across four datasets and nine models show frontier LLMs can achieve substantial privacy gains (e.g., up to 85.7% REDACT on open-ended prompts and up to 98.0% REDACT on closed-ended tasks) while maintaining task performance, but predicting the optimal minimization directly remains challenging due to abstraction biases and model-specific behaviors. The findings highlight a concrete privacy-utility frontier and reveal a capability gap: current models often overshare by default, underscoring the need for robust predictors and on-device data-minimization tooling to curb leakage in real-world deployments. The framework thus lays groundwork for both practical privacy-preserving prompting and further research into model-aware minimization strategies and human-AI collaboration in privacy decisions.

Abstract

The rapid deployment of large language models (LLMs) in consumer applications has led to frequent exchanges of personal information. To obtain useful responses, users often share more than necessary, increasing privacy risks via memorization, context-based personalization, or security breaches. We present a framework to formally define and operationalize data minimization: for a given user prompt and response model, we quantify the least privacy-revealing disclosure that maintains utility, and we propose a priority-queue tree search to locate this optimal point within a privacy-ordered transformation space. We evaluate the framework on four datasets spanning open-ended conversations (ShareGPT, WildChat) and knowledge-intensive tasks with single-ground-truth answers (CaseHold, MedQA), quantifying achievable data minimization with nine LLMs as the response model. Our results demonstrate that larger frontier LLMs can tolerate stronger data minimization than smaller open-source models while maintaining task quality (85.7% redaction for GPT-5 vs. 19.3% for Qwen2.5-0.5B). By comparing with our search-derived benchmarks, we find that LLMs struggle to predict optimal data minimization directly, showing a bias toward abstraction that leads to oversharing. This suggests not just a privacy gap, but a capability gap: models may lack awareness of what information they actually need to solve a task.
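
The priority-queue search described above can be illustrated with a minimal sketch. It is not the paper's Freeze-Then-Search algorithm: the freezing stage and the context-recovery step are omitted, the privacy comparator is replaced by an assumed scalar score (the sum of per-span action levels), and `utility_ok` is a hypothetical caller-supplied predicate standing in for the utility check against a response model. The names `privacy_score` and `search_minimal_disclosure` are illustrative only.

```python
import heapq
from typing import Callable, Optional, Tuple

# Ordinal action levels (assumed encoding): a higher level hides more.
RETAIN, ABSTRACT, REDACT = 0, 1, 2

Assignment = Tuple[int, ...]  # one action level per sensitive span


def privacy_score(assignment: Assignment) -> int:
    """Assumed stand-in for the privacy comparator: sum of action levels."""
    return sum(assignment)


def search_minimal_disclosure(
    num_spans: int,
    utility_ok: Callable[[Assignment], bool],
) -> Optional[Assignment]:
    """Best-first search from the most private assignment toward less private ones.

    Returns the first assignment that passes the utility predicate, which is a
    highest-scoring passing assignment under the assumed score, or None if no
    assignment passes. Worst-case cost is exponential in num_spans.
    """
    start = (REDACT,) * num_spans
    heap = [(-privacy_score(start), start)]  # min-heap on negated score
    seen = {start}
    while heap:
        _, current = heapq.heappop(heap)
        if utility_ok(current):
            return current
        # Relax one span by one level (REDACT -> ABSTRACT -> RETAIN).
        for i, level in enumerate(current):
            if level > RETAIN:
                child = current[:i] + (level - 1,) + current[i + 1:]
                if child not in seen:
                    seen.add(child)
                    heapq.heappush(heap, (-privacy_score(child), child))
    return None


# Toy predicate (hypothetical): span 0 must be retained verbatim,
# span 1 may be abstracted at most, span 2 is not needed at all.
def toy_utility(a: Assignment) -> bool:
    return a[0] == RETAIN and a[1] <= ABSTRACT


result = search_minimal_disclosure(3, toy_utility)
```

In this toy setting the search keeps span 0, abstracts span 1, and redacts span 2. In practice each call to the predicate would involve querying the response model, so the number of evaluated candidates is the dominant cost.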

Paper Structure

This paper contains 42 sections, 1 equation, 3 figures, 83 tables, 1 algorithm.

Figures (3)

  • Figure 1: Framework Overview. We present a running example to demonstrate how we perform a tree search ranked by privacy variants, and a transformation that achieves data minimization.
  • Figure 2: Oracle vs. Prediction REDACT and ABSTRACT Ratio.
  • Figure 3: Prediction vs. oracle minimization across datasets. Each panel shows per-model stacked proportions that sum to $1$. Outcomes are interpreted relative to the GPT-5 oracle using our privacy comparator (Sec. \ref{sec:comparator-io}) and utility predicate (Sec. \ref{sec:e2e}). Overshare: the prediction disclosure is less privacy-preserving than the oracle. Undershare+FAIL: the prediction hides more but fails the utility check. Undershare+PASS: the prediction hides more and passes utility. Fit: the prediction ties the oracle on privacy and passes utility.
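
The four outcome categories in the Figure 3 caption can be expressed as a small decision rule. The sketch below is an assumption-laden illustration: it models the privacy comparator as a comparison of integer scores (higher means more hidden), which may differ from the paper's comparator, and the function names are hypothetical. The caption does not say how a prediction that ties the oracle on privacy but fails utility is counted, so that case is labeled `Unclassified` here.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def classify_outcome(pred_privacy: int, oracle_privacy: int, passes_utility: bool) -> str:
    """Label one prediction relative to the oracle (scores: higher hides more)."""
    if pred_privacy < oracle_privacy:
        return "Overshare"  # less privacy-preserving than the oracle
    if pred_privacy > oracle_privacy:
        return "Undershare+PASS" if passes_utility else "Undershare+FAIL"
    # Tie on privacy: the caption only defines the passing case.
    return "Fit" if passes_utility else "Unclassified"


def outcome_proportions(records: Iterable[Tuple[int, int, bool]]) -> Dict[str, float]:
    """Per-category proportions (summing to 1) for one model on one dataset."""
    labels = [classify_outcome(p, o, u) for p, o, u in records]
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


# Made-up records for illustration: (prediction score, oracle score, passes utility).
example = [(1, 3, True), (4, 3, False), (4, 3, True), (3, 3, True)]
proportions = outcome_proportions(example)
```

Each of the four made-up records falls into a different category, so every proportion in this example is 0.25.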