Operationalizing Data Minimization for Privacy-Preserving LLM Prompting
Jijie Zhou, Niloofar Mireshghallah, Tianshi Li
TL;DR
This work formalizes data minimization for privacy-preserving LLM prompting as a privacy-utility optimization over a span-level transformation space, solved with a two-stage, priority-queue-based Freeze-Then-Search algorithm. It instantiates an ordinal action lattice {RETAIN, ABSTRACT, REDACT}, defines a privacy comparator, and uses a context-recovery step to evaluate utility. The result is a black-box, pre-inference approach that is compatible with diverse models. Empirical results across four datasets and nine models show that, with frontier LLMs as the response model, substantial minimization is achievable while maintaining task performance (e.g., up to 85.7% REDACT on open-ended prompts and up to 98.0% REDACT on closed-ended tasks). Predicting the optimal minimization directly remains challenging, however, because of abstraction biases and model-specific behaviors. The findings trace a concrete privacy-utility frontier and reveal a capability gap: current models often overshare by default, which underscores the need for robust predictors and on-device data-minimization tooling to curb leakage in real-world deployments. The framework lays groundwork for practical privacy-preserving prompting and for further research into model-aware minimization strategies and human-AI collaboration in privacy decisions.
Abstract
The rapid deployment of large language models (LLMs) in consumer applications has led to frequent exchanges of personal information. To obtain useful responses, users often share more than necessary, increasing privacy risks via memorization, context-based personalization, or security breaches. We present a framework to formally define and operationalize data minimization: for a given user prompt and response model, we quantify the least privacy-revealing disclosure that maintains utility, and we propose a priority-queue tree search to locate this optimal point within a privacy-ordered transformation space. We evaluate the framework on four datasets spanning open-ended conversations (ShareGPT, WildChat) and knowledge-intensive tasks with single-ground-truth answers (CaseHold, MedQA), quantifying achievable data minimization with nine LLMs as the response model. Our results demonstrate that larger frontier LLMs tolerate stronger data minimization than smaller open-source models while maintaining task quality (85.7% redaction for GPT-5 vs. 19.3% for Qwen2.5-0.5B). Comparing against our search-derived benchmarks, we find that LLMs struggle to predict optimal data minimization directly and show a bias toward abstraction that leads to oversharing. This suggests not just a privacy gap but a capability gap: models may lack awareness of what information they actually need to solve a task.
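To make the idea of a priority-queue search over a privacy-ordered transformation space concrete, the following is a minimal sketch and not the paper's Freeze-Then-Search algorithm. It assumes that each sensitive span takes one of the three actions {RETAIN, ABSTRACT, REDACT}, that privacy can be summarized by a scalar (the sum of ordinal action levels, which stands in for the paper's privacy comparator), and that a hypothetical black-box callable `preserves_utility` reports whether the response model still succeeds on the transformed prompt. The names `minimize`, `privacy_score`, and `toy_oracle` are illustrative only.

```python
import heapq
from enum import IntEnum
from itertools import count
from typing import Callable, Optional, Tuple


class Action(IntEnum):
    """Ordinal actions per span; a higher value reveals less."""
    RETAIN = 0
    ABSTRACT = 1
    REDACT = 2


Plan = Tuple[Action, ...]  # one action per sensitive span


def privacy_score(plan: Plan) -> int:
    # Assumption: a scalar stand-in for the privacy comparator.
    return sum(plan)


def minimize(n_spans: int,
             preserves_utility: Callable[[Plan], bool]) -> Optional[Plan]:
    """Best-first search from the most private plan (all REDACT).

    Plans are popped in order of decreasing privacy score; each failed plan
    is relaxed by lowering one span by one step. The first plan that passes
    the utility check is returned, or None if even all-RETAIN fails.
    Worst case visits all 3**n_spans plans, so this is for illustration only.
    """
    start: Plan = (Action.REDACT,) * n_spans
    tie = count()  # tie-breaker so the heap never compares plans directly
    heap = [(-privacy_score(start), next(tie), start)]
    seen = {start}
    while heap:
        _, _, plan = heapq.heappop(heap)
        if preserves_utility(plan):
            return plan
        for i, action in enumerate(plan):
            if action > Action.RETAIN:
                child = plan[:i] + (Action(action - 1),) + plan[i + 1:]
                if child not in seen:
                    seen.add(child)
                    heapq.heappush(
                        heap, (-privacy_score(child), next(tie), child))
    return None


def toy_oracle(plan: Plan) -> bool:
    # Hypothetical task: span 0 may be abstracted but not redacted,
    # span 1 is irrelevant, and span 2 must be kept verbatim.
    return plan[0] <= Action.ABSTRACT and plan[2] == Action.RETAIN


best = minimize(3, toy_oracle)
print([a.name for a in best])  # ['ABSTRACT', 'REDACT', 'RETAIN']
```

In practice each call to the utility check would involve querying the response model, so the number of plans evaluated is the dominant cost; the sketch makes no attempt to reduce it.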
