Verified Foundations for Differential Privacy
Markus de Medeiros, Muhammad Naveed, Tancrède Lepoint, Temesghen Kahsai, Tristan Ravitch, Stefan Zetzsche, Anjali Joshi, Joseph Tassarotti, Aws Albarghouthi, Jean-Baptiste Tristan
TL;DR
SampCert delivers the first mechanically verified, executable foundation for differential privacy, implemented in Lean4 with over 12,000 lines of formal proof. It provides a generic, instantiable DP framework (including pure DP and zCDP), alongside formally verified discrete Laplace and Gaussian samplers to avoid floating-point pitfalls. A simple probabilistic language (SLang) and a robust loop-proof method enable rigorous verification and scalable extraction to real-world deployments, including AWS usage. By tightly integrating Mathlib and offering extraction paths (FFI and Dafny-to-Python), SampCert commits to both research rigor and practical impact in DP tooling and implementations.
Abstract
Differential privacy (DP) has become the gold standard for privacy-preserving data analysis, but implementing it correctly has proven challenging. Prior work has focused on verifying DP at a high level, assuming the foundations are correct and a perfect source of randomness is available. However, the underlying theory of differential privacy can be very complex and subtle. Flaws in basic mechanisms and random number generation have been a critical source of vulnerabilities in real-world DP systems. In this paper, we present SampCert, the first comprehensive, mechanized foundation for differential privacy. SampCert is written in Lean with over 12,000 lines of proof. It offers a generic and extensible notion of DP, a framework for constructing and composing DP mechanisms, and formally verified implementations of Laplace and Gaussian sampling algorithms. SampCert provides (1) a mechanized foundation for developing the next generation of differentially private algorithms, and (2) mechanically verified primitives that can be deployed in production systems. Indeed, SampCert's verified algorithms power the DP offerings of Amazon Web Services (AWS), demonstrating its real-world impact. SampCert's key innovations include: (1) A generic DP foundation that can be instantiated for various DP definitions (e.g., pure, concentrated, Rényi DP); (2) formally verified discrete Laplace and Gaussian sampling algorithms that avoid the pitfalls of floating-point implementations; and (3) a simple probability monad and novel proof techniques that streamline the formalization. To enable proving complex correctness properties of DP and random number generation, SampCert makes heavy use of Lean's extensive Mathlib library, leveraging theorems in Fourier analysis, measure and probability theory, number theory, and topology.
