Differential Privacy in Generative AI Agents: Analysis and Optimal Tradeoffs

Ya-Ting Yang; Quanyan Zhu

Abstract

Large language models (LLMs) and AI agents are increasingly integrated into enterprise systems to access internal databases and generate context-aware responses. While such integration improves productivity and decision support, the model outputs may inadvertently reveal sensitive information. Although many prior efforts focus on protecting the privacy of user prompts, relatively few studies consider privacy risks from the enterprise data perspective. Hence, this paper develops a probabilistic framework for analyzing privacy leakage in AI agents based on differential privacy. We model response generation as a stochastic mechanism that maps prompts and datasets to distributions over token sequences. Within this framework, we introduce token-level and message-level differential privacy and derive privacy bounds that relate privacy leakage to generation parameters such as temperature and message length. We further formulate a privacy-utility design problem that characterizes optimal temperature selection.
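As a hedged illustration of how temperature can enter a token-level privacy bound, the sketch below compares temperature-scaled softmax distributions computed from two sets of logits, standing in for the same prompt answered over two neighboring datasets. The logits, the function names, and the `2 * sensitivity / temperature` bound are assumptions made for this example: the bound is the standard log-ratio bound for softmax outputs whose logits differ by at most `sensitivity` in the sup norm, and it is not necessarily the bound derived in the paper.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return token probabilities for logits scaled by 1/temperature."""
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def max_log_ratio(p, q):
    """Largest absolute log-probability ratio between two distributions."""
    return max(abs(math.log(a / b)) for a, b in zip(p, q))

# Hypothetical next-token logits under two neighboring datasets.
logits_d = [2.0, 1.0, 0.5, -1.0]
logits_d_prime = [1.7, 1.2, 0.5, -0.8]

# Sup-norm gap between the two logit vectors (assumed sensitivity).
sensitivity = max(abs(a - b) for a, b in zip(logits_d, logits_d_prime))

leakage = {}
for temperature in (0.5, 1.0, 2.0):
    p = softmax_with_temperature(logits_d, temperature)
    q = softmax_with_temperature(logits_d_prime, temperature)
    observed = max_log_ratio(p, q)
    bound = 2.0 * sensitivity / temperature
    leakage[temperature] = (observed, bound)
    print(f"T={temperature}: observed={observed:.4f}, bound={bound:.4f}")
```

In this toy setting the observed log-ratio always stays below the bound, and the bound itself shrinks as the temperature grows, which matches the qualitative link between temperature and leakage described in the abstract.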

Paper Structure (20 sections, 5 theorems, 13 equations, 2 figures)

Introduction
Literature Review
Enterprise Guardrails and Secure Architectures
Differential Privacy for Language Models
Generative Message Model
Token and Message Spaces
Prompts and Information Sets
Token-Level Generative Mechanism
Temperature and Length of the Message
Differential Privacy of LLM-Agent
Message-Level Differential Privacy
Token-Level Differential Privacy
Optimal DP Design of LLM-Agent
Message-Level Utility
Optimal Privacy-Utility Tradeoff
...and 5 more sections

Key Result

Lemma 1

If the token generation mechanism at each step $k$ satisfies $(\varepsilon_k,\delta_k)$-DP in the sense of Definition 3 (Token Differential Privacy), then the induced message-generation mechanism $\mathcal{M}_i : (p_i^t,D_i^t,I_i^t) \to \mathcal{X}_i$ satisfies $(\varepsilon,\delta)$-DP in the sense of Definition 2 (Message Differential Privacy) at the message ...
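The lemma statement is cut off in this listing, so the composed parameters $(\varepsilon,\delta)$ are not shown. As a rough sketch only, the code below applies the standard basic sequential composition bound, under which per-step guarantees $(\varepsilon_k,\delta_k)$ add up across the tokens of a message. The function name `compose_basic` and the per-token budgets are hypothetical, and the paper's actual statement may differ from this bound.

```python
import math

def compose_basic(token_budgets):
    """Basic sequential composition: sum the per-token (epsilon, delta) pairs.

    This is the textbook composition bound, used here as an assumption;
    it is not taken from the truncated lemma statement.
    """
    epsilon = math.fsum(eps for eps, _ in token_budgets)
    delta = math.fsum(dlt for _, dlt in token_budgets)
    return epsilon, delta

# Hypothetical message of 20 tokens, each generated under (0.1, 1e-6)-DP.
message_length = 20
token_budgets = [(0.1, 1e-6)] * message_length

epsilon, delta = compose_basic(token_budgets)
print(f"message-level guarantee: epsilon={epsilon:.2f}, delta={delta:.1e}")
```

Under this bound the message-level guarantee loosens linearly with message length, which is one way to read the abstract's remark that leakage depends on message length.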

Figures (2)

Figure 1: Privacy leakage under different temperatures. Lines show the means, while shaded areas indicate the standard deviations.
Figure 2: The quantities in the proposed privacy–utility framework under different temperatures.

Theorems & Definitions (13)

Definition 1: Message Space
Definition 2: Message Differential Privacy
Definition 3: Token Differential Privacy
Lemma 1
proof
Proposition 1: Token-Level Privacy Bound
proof
Corollary 1: Message-Level Privacy Bound
proof
Proposition 2: Derivative and Monotonicity
...and 3 more
