Composition of Differential Privacy & Privacy Amplification by Subsampling
Thomas Steinke
TL;DR
This chapter develops a unified framework for analyzing how privacy degrades under repeated differentially private (DP) analyses, through the lens of privacy loss distributions (PLDs). It introduces concentrated DP (CDP) and Rényi DP (RDP) as refined tools that yield tighter composition bounds than classical pure or approximate DP, and shows how Gaussian mechanisms naturally satisfy these notions. A central theme is privacy amplification by subsampling, with tight results for Poisson and fixed-size subsampling, and practical guidance for integrating subsampling into iterative algorithms such as SGD. The chapter connects fundamental bounds (e.g., advanced composition, optimal composition) to actionable accounting methods, enabling private data reuse while maintaining utility. Collectively, it provides rigorous techniques for privacy budgeting in complex, multi-round, or subsampled analyses used in AI applications.
Abstract
This chapter is meant to be part of the book "Differential Privacy for Artificial Intelligence Applications." We give an introduction to the most important property of differential privacy -- composition: running multiple independent analyses on the data of a set of people will still be differentially private as long as each of the analyses is private on its own -- as well as the related topic of privacy amplification by subsampling. This chapter introduces the basic concepts and gives proofs of the key results needed to apply these tools in practice.
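As a rough illustration of the two topics named above, the following Python sketch evaluates three standard textbook bounds: basic composition, advanced composition, and amplification by Poisson subsampling. The function names and the parameter values are assumptions chosen for illustration, and the formulas are the commonly cited forms, which may differ in constants or adjacency conventions from the statements proved in the chapter itself.

```python
import math

def basic_composition(eps, delta, k):
    # k-fold composition of (eps, delta)-DP mechanisms is (k*eps, k*delta)-DP.
    return k * eps, k * delta

def advanced_composition(eps, delta, k, delta_prime):
    # Commonly cited advanced composition bound: for any delta_prime > 0,
    # the k-fold composition is (eps_total, k*delta + delta_prime)-DP.
    eps_total = (math.sqrt(2 * k * math.log(1 / delta_prime)) * eps
                 + k * eps * math.expm1(eps))
    return eps_total, k * delta + delta_prime

def poisson_subsample(eps, delta, q):
    # Running an (eps, delta)-DP mechanism on a Poisson subsample that keeps
    # each record independently with probability q (add/remove adjacency
    # assumed) gives (log(1 + q*(e^eps - 1)), q*delta)-DP.
    return math.log1p(q * math.expm1(eps)), q * delta

if __name__ == "__main__":
    # Illustrative parameters only: 100 analyses, each 0.1-DP.
    print(basic_composition(0.1, 0.0, 100))
    print(advanced_composition(0.1, 0.0, 100, 1e-6))
    # A 1.0-DP mechanism applied to a 1% Poisson subsample.
    print(poisson_subsample(1.0, 0.0, 0.01))
```

For these illustrative parameters the advanced bound gives a smaller total epsilon than the basic one, at the cost of a small additive delta, and subsampling at rate q shrinks epsilon roughly in proportion to q when epsilon is small.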
