Locally Differentially Private In-Context Learning

Chunyan Zheng, Keke Sun, Wenhao Zhao, Haibo Zhou, Lixin Jiang, Shaoyang Song, Chunlai Zhou

TL;DR

This work addresses privacy risks in in-context learning by LLMs when demonstrations may contain sensitive labels. It introduces LDP-ICL, a locally differentially private framework that perturbs demonstration labels with a $k$-ary randomized response and analyzes the resulting privacy–utility trade-off via a gradient-descent perspective on ICL. The authors derive a test-prediction formula under noisy demonstrations, connect ICL to linear self-attention, and extend the framework to discrete distribution estimation. Empirical results across multiple datasets show that LDP-ICL can closely match non-private ICL for moderate privacy budgets while providing strong privacy guarantees, and that it can outperform Warner’s mechanism in high-privacy regimes for distribution estimation. The work also provides templates, proofs, and extensive ablations to support the proposed approach and outlines future directions for refining demonstration selection and privatizing additional components of the ICL pipeline.
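As an illustration of the label-perturbation step, here is a minimal sketch of $k$-ary randomized response in its textbook form: the true label is kept with probability $e^{\epsilon}/(e^{\epsilon}+k-1)$ and otherwise replaced by one of the other $k-1$ labels chosen uniformly. The function names `k_rr` and `privatize_demonstrations` are hypothetical and do not come from the paper; the paper's own implementation may differ.

```python
import math
import random


def k_rr(label, labels, epsilon, rng=random):
    """Perturb one label with k-ary randomized response (textbook form)."""
    k = len(labels)
    if k < 2:
        raise ValueError("k-ary randomized response needs at least two labels")
    # Probability of reporting the true label unchanged.
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return label
    # Otherwise report one of the other k - 1 labels uniformly at random.
    others = [c for c in labels if c != label]
    return rng.choice(others)


def privatize_demonstrations(demos, labels, epsilon, rng=random):
    """Return (input, noisy_label) pairs; the inputs are left untouched."""
    return [(x, k_rr(y, labels, epsilon, rng)) for x, y in demos]
```

Calling `privatize_demonstrations(demos, ["positive", "negative"], 1.0)` on a list of `(text, label)` pairs would yield the noisy demonstration set that is then placed in the prompt. Only the labels are randomized, which matches the setting where labels are the sensitive part.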

Abstract

Large pretrained language models (LLMs) have shown a surprising in-context learning (ICL) ability. An important application in deploying LLMs is to augment them with a private database for a specific task. The main problem with this promising commercial use is that LLMs have been shown to memorize their training data, and their prompt data are vulnerable to membership inference attacks (MIA) and prompt leaking attacks. To address this problem, we treat LLMs as untrusted with respect to privacy and propose a locally differentially private framework for in-context learning (LDP-ICL) in settings where labels are sensitive. Building on the view that Transformers perform in-context learning via gradient descent, we analyze the trade-off between privacy and utility in LDP-ICL for classification. Moreover, we apply LDP-ICL to the discrete distribution estimation problem. Finally, we perform several experiments to support our analysis.
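For context on the discrete distribution estimation problem, the sketch below shows the classical way to recover label frequencies from randomized-response outputs by inverting the known noise. It is not the paper's ICL-based estimator. It is the standard debiasing step that such an estimator would be compared against, and the function name `estimate_distribution` is hypothetical. It assumes the labels were perturbed by $k$-ary randomized response with $\epsilon > 0$.

```python
import math
from collections import Counter


def estimate_distribution(noisy_labels, labels, epsilon):
    """Debias observed frequencies of k-RR outputs (classical estimator)."""
    k = len(labels)
    n = len(noisy_labels)
    denom = math.exp(epsilon) + k - 1
    p = math.exp(epsilon) / denom  # probability of keeping the true label
    q = 1.0 / denom                # probability of reporting one specific other label
    counts = Counter(noisy_labels)
    # E[observed_c] = q + pi_c * (p - q), so solve for pi_c.
    return {c: (counts[c] / n - q) / (p - q) for c in labels}
```

With two labels and $\epsilon = \ln 3$ (so $p = 3/4$), observed frequencies of 0.7 and 0.3 are mapped back to estimates of 0.9 and 0.1. For small samples or small $\epsilon$ the raw estimates can fall outside $[0, 1]$, so in practice they are usually clipped and renormalized.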

Paper Structure (31 sections, 1 theorem, 15 equations, 6 figures, 11 tables, 2 algorithms)

Key Result

Proposition 3.1

Given previous tokens $\mathcal{E}_n=\{\left(\boldsymbol{x_i},y_i\right)\}_{i=1}^n$, we can construct key, query and value matrices $\boldsymbol{W_k}$, $\boldsymbol{W_q}$, $\boldsymbol{W_v}$ as well as the projection matrix $\boldsymbol{P}$ such that a 1-head linear attention operation on the matrix
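The statement above is cut off in this listing, so the following is not the paper's construction. It is a small sketch of the general idea behind the gradient-descent view of ICL, under the assumption of linear regression with squared loss and zero initial weights: one gradient step produces the same test prediction as a softmax-free (linear) attention readout with keys $\boldsymbol{x_i}$, query $\boldsymbol{x_{\text{test}}}$, and values $y_i$. All function names and the learning rate `lr` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def gd_step_prediction(xs, ys, x_test, lr):
    """Predict after one gradient step on (1/2n) * sum (w.x_i - y_i)^2 from w = 0."""
    n, d = len(xs), len(x_test)
    w = [0.0] * d
    grad = [
        sum((dot(w, x) - y) * x[j] for x, y in zip(xs, ys)) / n
        for j in range(d)
    ]
    w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return dot(w, x_test)


def linear_attention_prediction(xs, ys, x_test, lr):
    """Linear attention readout: scores are key.query, values are the labels."""
    n = len(xs)
    return lr / n * sum(y * dot(x, x_test) for x, y in zip(xs, ys))
```

Both functions return the same number for any demonstration set. Because the prediction is linear in the labels $y_i$, this view gives a direct handle on how label noise in the demonstrations propagates to the test prediction.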

Figures (6)

  • Figure 1: The framework of LDP-ICL. We first sample a few input-label pairs from the original private database to form the demonstration set. Next, we employ the $k$-ary randomized response mechanism $Q_{k\text{-RR}}$ to perturb the labels and then perform ICL with the noisy demonstration set prepended to a given query $\boldsymbol{x_{\text{test}}}$. Finally, the response is returned to the adversary.
  • Figure 2: Obfuscation in labels, where $p=\frac{e^{\epsilon}}{e^{\epsilon}+1}$
  • Figure 3: Classification scenario: Test performance on (a) SST-2, (b) Subj, (c) Ethos and (d) SMS_Spam
  • Figure 4: Distribution estimation scenario: Estimation results on (a) SST-2 and (b) Ethos
  • Figure 5: Performance for different numbers of examples
  • ...and 1 more figure
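
The value $p=\frac{e^{\epsilon}}{e^{\epsilon}+1}$ in the Figure 2 caption is the keep probability of binary randomized response. As a short check, with $\tilde{y}$ denoting the reported label (notation assumed here, not taken from the paper), the ratio of the probabilities of any output under two different true labels is at most $e^{\epsilon}$, which is the $\epsilon$-LDP condition:

$$
\frac{\Pr[\tilde{y}=y \mid y]}{\Pr[\tilde{y}=y \mid y']} = \frac{p}{1-p} = \frac{e^{\epsilon}/(e^{\epsilon}+1)}{1/(e^{\epsilon}+1)} = e^{\epsilon}, \qquad y' \neq y.
$$

The same calculation for the standard $k$-ary form, with keep probability $\frac{e^{\epsilon}}{e^{\epsilon}+k-1}$ and probability $\frac{1}{e^{\epsilon}+k-1}$ for each other label, again gives a ratio of exactly $e^{\epsilon}$, and it reduces to the binary case at $k=2$.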

Theorems & Definitions (2)

  • Proposition 3.1
  • Definition 3.2