Differentially Private Next-Token Prediction of Large Language Models

James Flemings, Meisam Razaviyayn, Murali Annavaram

TL;DR

The paper presents Private Mixing of Ensemble Distributions (PMixED), a private prediction protocol for next-token prediction that uses the inherent stochasticity of next-token sampling and a public model to achieve Differential Privacy.

Abstract

Ensuring the privacy of Large Language Models (LLMs) is becoming increasingly important. The most widely adopted technique to accomplish this is DP-SGD, which trains a model to guarantee Differential Privacy (DP). However, DP-SGD overestimates an adversary's capabilities by assuming white-box access to the model and, as a result, incurs longer training times and larger memory usage than SGD. On the other hand, commercial LLM deployments are predominantly cloud-based; hence, adversarial access to LLMs is black-box. Motivated by these observations, we present Private Mixing of Ensemble Distributions (PMixED): a private prediction protocol for next-token prediction that utilizes the inherent stochasticity of next-token sampling and a public model to achieve Differential Privacy. We formalize this by introducing RD-mollifiers, which project the output distribution of each model in an ensemble of fine-tuned LLMs onto a set around a public LLM's output distribution; the projected distributions are then averaged, and the next token is sampled from the average. Unlike DP-SGD, which needs to consider the model architecture during training, PMixED is model agnostic, which makes it an appealing solution for current deployments. Our results show that PMixED achieves a stronger privacy guarantee than sample-level privacy and outperforms DP-SGD for privacy $\epsilon = 8$ on large-scale datasets. Thus, PMixED offers a practical alternative to DP training methods for achieving strong generative utility without compromising privacy.

Paper Structure

This paper contains 26 sections, 11 theorems, 21 equations, 5 figures, 2 tables, 1 algorithm.

Key Result

Lemma 4.1

If $\mathcal{M}_{(\alpha, \beta), Q_{0}}$ is an $(\alpha, \beta)$-RD Mollifier relative to $Q_{0}$, then $\mathcal{M}_{(\alpha, \beta), Q_{0}}$ is an $(\alpha, 4\beta)$-RD Mollifier.

Figures (5)

  • Figure 1: A brief overview of PMixED, which can be broken down into two phases. In Phase-I, the private dataset $D$ is partitioned into $N$ pairwise disjoint subsets $D_1, ..., D_N$, and a pre-trained LLM is fine-tuned on each $D_i$ to produce $p_{i}$. Afterward, PMixED performs private predictions in Phase-II, which can be further broken down into two steps. In Step 1, which we call mixing, a query $\mathbf{x}_t$ from a user is received at time $1 \leq t \leq T$. First, PMixED subsamples a subset of the ensemble, then generates the output distribution $p_{i}(\mathbf{x}_t)$ of each selected model and the output distribution $p_0(\mathbf{x}_t)$ of a public model. Each $p_i(\mathbf{x}_t)$ is projected onto a Rényi divergence ball centered at the output distribution $p_{0}(\mathbf{x}_t)$ with radius $\beta\alpha$ to produce $\overline{p}_{i}(\mathbf{x}_t)$, which is a mixture of private $p_i(\mathbf{x}_t)$ and public $p_0(\mathbf{x}_t)$ information. In Step 2, all projected distributions are averaged into $p(\mathbf{x}_t)$, and the next token is sampled as $y_t \sim p(\mathbf{x}_t)$.
  • Figure 2: Left: Projecting a distribution $P$ (blue curve) onto an $(\alpha, \beta)$-RD mollifier (black curve). The dotted line represents the maximum divergence $\beta\alpha$ of the mollifier. Note how the projected distribution maintains the same modes as $P$. Right: $Q'$ is the maximized projection of $Q$ onto a relative RD-mollifier around $Q_{0}$, which diverges by at most $\beta \alpha$ from $Q_{0}$.
  • Figure 3: Comparison of PMixED against 3 baselines on WikiText-103 and One Billion Word using GPT-2.
  • Figure 4: Ablation study on DP hyperparameters using WikiText-103. The x-axis is the hyperparameter space, and the y-axis is the perplexity score.
  • Figure 5: Alpha
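
The Phase-II procedure described in the Figure 1 caption can be illustrated with a minimal Python sketch. Several details here are assumptions rather than the authors' implementation: the projection is modeled as the linear mixture $\lambda p_i + (1 - \lambda) p_0$ with the largest feasible $\lambda$ found by bisection, the ball constraint is checked in both divergence directions, each ensemble member is subsampled independently with a hypothetical probability `q`, and the public distribution is used alone when no member is selected. All function names are illustrative.

```python
import numpy as np

def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q) for discrete distributions, alpha > 1.
    Assumes both distributions have full support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.log(np.sum(p**alpha * q**(1.0 - alpha))) / (alpha - 1.0))

def mix_toward_public(p_i, p_0, alpha, radius, iters=50):
    """Assumed projection: the largest lam in [0, 1] such that
    lam * p_i + (1 - lam) * p_0 stays within `radius` of p_0
    (checked in both directions), found by bisection."""
    def within(lam):
        m = lam * p_i + (1.0 - lam) * p_0
        return max(renyi_divergence(m, p_0, alpha),
                   renyi_divergence(p_0, m, alpha)) <= radius

    if within(1.0):
        return p_i.copy(), 1.0
    lo, hi = 0.0, 1.0  # lam = 0 gives p_0 itself, which is always feasible
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if within(mid):
            lo = mid
        else:
            hi = mid
    return lo * p_i + (1.0 - lo) * p_0, lo

def pmixed_next_token(private_dists, p_0, alpha, beta, q, rng):
    """One private prediction: subsample, project, average, sample."""
    radius = beta * alpha  # ball radius from the Figure 1 caption
    # Assumption: each ensemble member is kept independently with probability q.
    selected = [p for p in private_dists if rng.random() < q]
    if selected:
        projected = [mix_toward_public(p, p_0, alpha, radius)[0] for p in selected]
        mixed = np.mean(projected, axis=0)
    else:
        mixed = np.asarray(p_0, dtype=float)  # assumption: public-only fallback
    mixed = mixed / mixed.sum()  # guard against floating-point drift
    token = rng.choice(len(mixed), p=mixed)
    return int(token), mixed
```

In this sketch a smaller $\beta$ shrinks the ball and pushes every projected distribution toward $p_0$, so the sampled token depends less on any single private model.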

Theorems & Definitions (18)

  • Definition 3.1: Approximate DP [dwork2014algorithmic]
  • Definition 3.2: Renyi Divergence [mironov2017renyi]
  • Definition 3.3: $(\alpha, \epsilon)$-RDP [mironov2017renyi]
  • Definition 3.4: Privacy-Preserving Prediction
  • Definition 4.1: $(\alpha, \beta)$-RD Mollifier
  • Definition 4.2: $(\alpha, \beta)$-RD Mollifier relative to $Q_{0}$
  • Lemma 4.1
  • Theorem 5.1
  • Theorem 5.2
  • Definition B.1: $(\alpha, \beta)$-RD private sampler
  • ...and 8 more
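
For orientation, Definitions 3.1 to 3.3 above refer to standard notions that are commonly stated as follows. This is a sketch in generic notation (a randomized mechanism $\mathcal{M}$, neighboring datasets $D, D'$, and an order $\alpha > 1$ are assumed symbols), and the paper's exact wording may differ. The last line is the well-known conversion from RDP to approximate DP.

```latex
% Approximate DP: for all neighboring D, D' and all measurable sets S
\Pr[\mathcal{M}(D) \in S] \leq e^{\epsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta

% Renyi divergence of order \alpha > 1 between distributions P and Q
D_{\alpha}(P \,\|\, Q) = \frac{1}{\alpha - 1}
  \log \mathbb{E}_{x \sim Q}\left[ \left( \frac{P(x)}{Q(x)} \right)^{\alpha} \right]

% (\alpha, \epsilon)-RDP: for all neighboring D, D'
D_{\alpha}\big( \mathcal{M}(D) \,\|\, \mathcal{M}(D') \big) \leq \epsilon

% Conversion: (\alpha, \epsilon)-RDP implies, for any \delta \in (0, 1),
\left( \epsilon + \frac{\log(1/\delta)}{\alpha - 1}, \; \delta \right)\text{-DP}
```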