Privacy-Preserving In-Context Learning for Large Language Models
Tong Wu, Ashwinee Panda, Jiachen T. Wang, Prateek Mittal
TL;DR
Addresses privacy leakage in in-context learning (ICL) for large language models by proposing Differentially Private In-context Learning (DP-ICL), a general paradigm that privatizes ICL responses via a noisy consensus over disjoint exemplar sets. It provides a concrete algorithm for private top-$k$ releases on token histograms using the gap $d_k = H_{(k)} - H_{(k+1)}$ and a noisy threshold, linking the approach to the exponential mechanism and Rényi-DP guarantees. The analysis includes privacy amplification by subsampling and limited-domain Rényi-DP bounds. Empirically, DP-ICL demonstrates a strong utility-privacy tradeoff on four text classification benchmarks and two language generation tasks.
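The gap test behind the private top-$k$ release can be illustrated with a minimal sketch. It assumes that $H_{(k)}$ denotes the $k$-th largest count in the token histogram, that the noise is Gaussian, and that nothing is released when the test fails. The function name `private_top_k` and the parameters `threshold` and `noise_scale` are hypothetical, and the sketch does not calibrate them to any particular privacy budget.

```python
import numpy as np

def private_top_k(histogram, k, threshold, noise_scale, rng):
    """Release the top-k token set only if the noisy gap d_k clears a threshold.

    histogram: dict mapping token -> count (assumed input format).
    threshold, noise_scale: illustrative values, not calibrated for DP.
    """
    # Sort tokens by count, largest first, so counts[i] plays the role of H_(i+1).
    tokens = sorted(histogram, key=histogram.get, reverse=True)
    counts = [histogram[t] for t in tokens]

    # d_k = H_(k) - H_(k+1); treat a missing (k+1)-th entry as count 0.
    next_count = counts[k] if k < len(counts) else 0
    d_k = counts[k - 1] - next_count

    # Noisy threshold test (Gaussian noise is an assumption of this sketch).
    noisy_gap = d_k + rng.normal(0.0, noise_scale)
    if noisy_gap > threshold:
        return set(tokens[:k])  # unordered top-k set
    return None  # test failed: release nothing


rng = np.random.default_rng(0)
clear_gap = {"cat": 100, "dog": 90, "fish": 5}
flat = {"cat": 10, "dog": 10, "fish": 10}
released = private_top_k(clear_gap, k=2, threshold=10.0, noise_scale=1.0, rng=rng)
withheld = private_top_k(flat, k=2, threshold=50.0, noise_scale=1.0, rng=rng)
```

A large gap means the top-$k$ set is stable when a single exemplar changes, which is why the release is conditioned on it.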
Abstract
In-context learning (ICL) is an important capability of Large Language Models (LLMs), enabling these models to dynamically adapt based on specific, in-context exemplars, thereby improving accuracy and relevance. However, LLMs' responses may leak sensitive private information contained in the in-context exemplars. To address this challenge, we propose Differentially Private In-context Learning (DP-ICL), a general paradigm for privatizing ICL tasks. The key idea of the DP-ICL paradigm is to generate differentially private responses through a noisy consensus among an ensemble of LLM responses based on disjoint exemplar sets. Building on this paradigm, we instantiate several techniques that show how to privatize ICL for text classification and language generation. We evaluate DP-ICL on four text classification benchmarks and two language generation tasks, and our empirical results show that DP-ICL achieves a strong utility-privacy tradeoff.
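For text classification, the noisy consensus over disjoint exemplar sets can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: the vote histogram is perturbed with Gaussian noise before taking the argmax, `predict` is a hypothetical stand-in for an LLM call that returns one label per exemplar subset, and `sigma` is an illustrative noise scale that is not calibrated to a privacy budget.

```python
import numpy as np

def partition_exemplars(exemplars, num_subsets, rng):
    """Randomly split exemplars into disjoint subsets of near-equal size."""
    order = rng.permutation(len(exemplars))
    return [[exemplars[i] for i in chunk]
            for chunk in np.array_split(order, num_subsets)]

def noisy_consensus(exemplars, query, labels, predict, num_subsets, sigma, rng):
    """Aggregate one vote per disjoint subset, then release the noisy argmax.

    predict(subset, query) is a hypothetical callable wrapping the LLM;
    it must return one element of `labels`.
    """
    votes = np.zeros(len(labels))
    for subset in partition_exemplars(exemplars, num_subsets, rng):
        votes[labels.index(predict(subset, query))] += 1

    # Each exemplar sits in exactly one subset, so it affects at most one vote.
    noisy_votes = votes + rng.normal(0.0, sigma, size=len(labels))
    return labels[int(np.argmax(noisy_votes))]


# Usage with a stub in place of a real model call.
def stub_predict(subset, query):
    return "positive"

rng = np.random.default_rng(0)
exemplars = [f"example {i}" for i in range(100)]
subsets = partition_exemplars(exemplars, 20, rng)
answer = noisy_consensus(exemplars, "great movie!", ["positive", "negative"],
                         stub_predict, num_subsets=20, sigma=1.0, rng=rng)
```

Because the subsets are disjoint, changing a single exemplar can alter at most one subset's vote, which bounds the sensitivity of the histogram that the noise has to mask.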
