ConfusionPrompt: Practical Private Inference for Online Large Language Models

Peihua Mai, Youjia Yang, Ran Yan, Rui Ye, Yan Pang

TL;DR

ConfusionPrompt is a novel framework for private LLM inference. It protects user privacy by decomposing the original prompt into smaller sub-prompts and generating pseudo-prompts alongside the genuine sub-prompts; the whole group is then sent to the LLM.

Abstract

State-of-the-art large language models (LLMs) are typically deployed as online services, requiring users to transmit detailed prompts to cloud servers. This raises significant privacy concerns. In response, we introduce ConfusionPrompt, a novel framework for private LLM inference that protects user privacy by: (i) decomposing the original prompt into smaller sub-prompts, and (ii) generating pseudo-prompts alongside the genuine sub-prompts, which are then sent to the LLM. The server responses are later recomposed by the user to reconstruct the final output. This approach offers key advantages over previous LLM privacy protection methods: (i) it integrates seamlessly with existing black-box LLMs, and (ii) it delivers a significantly improved privacy-utility trade-off compared to existing text perturbation methods. We also develop a $(\lambda, \mu, \rho)$-privacy model to formulate the requirements for a privacy-preserving group of prompts and provide a complexity analysis to justify the role of prompt decomposition. Our empirical evaluation shows that ConfusionPrompt achieves significantly higher utility than local inference methods using open-source models and perturbation-based techniques, while also reducing memory consumption compared to open-source LLMs.

Paper Structure (36 sections, 4 theorems, 16 equations, 5 figures, 10 tables, 1 algorithm)


Key Result

Theorem 11

Let $\boldsymbol{P}=\{\boldsymbol{p}_0, \boldsymbol{p}_1, ..., \boldsymbol{p}_n\}$ be a group of prompts, where $\boldsymbol{p}_0$ is the genuine user prompt and the remaining ones are dummy prompts. Suppose each prompt in the group represents a single paragraph $|\boldsymbol{p}_i|=1$, $\forall i \i

Figures (5)

  • Figure 1: Overview of ConfusionPrompt.
  • Figure 2: Example of decomposition savings in query complexity. The decomposition module reduces the query complexity from 9 to 6 under privacy requirement $\mu=3$. The reduction becomes more pronounced for larger $\mu$: for instance, the complexity decreases from 100 to 20 under $\mu=10$.
  • Figure 3: Prompt identification attack accuracy under various combinations of privacy parameters.
  • Figure 4: Attribute inference attack accuracy for ConfusionPrompt and LDP-based methods.
  • Figure 5: Monetary ratio on the StrategyQA and MuSiQue datasets before (w/o decomp) and after (decomp) decomposition under various significance levels $\mu$. Decomposition in ConfusionPrompt substantially reduces the monetary cost, indicating its efficiency.
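
The numbers quoted in the Figure 2 caption (9 to 6 at $\mu=3$, 100 to 20 at $\mu=10$) are consistent with a simple counting model, sketched below in Python. The model is an assumption made for illustration and is not stated in this summary: a prompt is taken to carry two private attributes, an undecomposed prompt is taken to need $\mu$ alternatives for every attribute jointly ($\mu^k$ prompts), and a decomposed prompt is taken to need $\mu$ alternatives per single-attribute sub-prompt ($k \cdot \mu$ prompts). The function names and the parameter `num_attributes` are hypothetical.

```python
def queries_without_decomposition(mu: int, num_attributes: int) -> int:
    """Assumed model: every combination of mu alternatives per attribute
    must be covered by one full prompt, giving mu ** k prompts."""
    return mu ** num_attributes


def queries_with_decomposition(mu: int, num_attributes: int) -> int:
    """Assumed model: each single-attribute sub-prompt is covered by
    mu prompts independently, giving k * mu prompts."""
    return mu * num_attributes


if __name__ == "__main__":
    k = 2  # assumption: two private attributes, chosen to match the caption
    for mu in (3, 10):
        before = queries_without_decomposition(mu, k)
        after = queries_with_decomposition(mu, k)
        print(f"mu={mu}: {before} -> {after}")
    # prints "mu=3: 9 -> 6" and "mu=10: 100 -> 20"
```

Under this assumed model the gap grows polynomially in $\mu$ for fixed $k$, which matches the caption's remark that the reduction is more pronounced for larger $\mu$; the paper's own complexity analysis should be consulted for the exact statement.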

Theorems & Definitions (17)

  • Definition 1: Private Attributes
  • Remark 2
  • Definition 3: Attribute-attribute Similarity
  • Definition 4: Correspondent Attributes
  • Remark 5
  • Definition 6: Prompt-prompt Similarity
  • Definition 7: Significance of Single Attribute
  • Definition 8: Significance of Attribute Set
  • Definition 9: Genuineness
  • Definition 10: User Privacy
  • ...and 7 more