Differentially Private Distribution Estimation Using Functional Approximation
Ye Tao, Anand D. Sarwate
TL;DR
This work tackles private CDF estimation by projecting the empirical CDF onto a finite-dimensional space spanned by Legendre polynomials and privatizing the projection coefficients via the functional mechanism, achieving $(\epsilon,\delta)$-DP. The core contribution, Polynomial Projection (PP), provides a principled trade-off between accuracy and privacy, with explicit $L_2$ error bounds and favorable behavior in decentralized and incremental-data scenarios. Empirical results show that PP is competitive with adaptive quantiles in moderate-privacy regimes and outperforms histogram queries, while offering practical advantages for distributed settings and privacy-preserving visualizations. The approach opens avenues for alternative function spaces and high-dimensional extensions, along with a deeper examination of post-processing utilities.
Abstract
The cumulative distribution function (CDF) is a fundamental tool for characterizing random variables, and estimating it from sensitive data calls for privacy-preserving methods. This paper introduces a novel privacy-preserving CDF estimation method inspired by functional analysis and the functional mechanism. Our approach projects the empirical CDF onto a predefined function space, approximating it with a fixed set of basis functions, and protects the resulting coefficients to obtain a differentially private empirical CDF. Compared to existing methods such as histogram queries and adaptive quantiles, our method is preferable in decentralized settings and in scenarios where CDFs must be updated with newly collected data.
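To make the idea concrete, the following is a minimal sketch of projecting an empirical CDF onto Legendre polynomials and perturbing the coefficients, not the authors' implementation. It assumes data bounded in a known interval `[lo, hi]`, uses the classical Gaussian mechanism (valid for $\epsilon < 1$) in place of whatever noise calibration the paper derives, and relies on a crude $L_2$ sensitivity bound of $\sqrt{1 + 4K}/n$ under replace-one neighboring datasets. The function names `legendre_cdf_coeffs`, `privatize_coeffs`, and `eval_cdf` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_cdf_coeffs(x, degree, lo=0.0, hi=1.0):
    """Legendre coefficients of the empirical CDF of x on [lo, hi] (non-private)."""
    x = np.clip(np.asarray(x, dtype=float), lo, hi)
    t = 2.0 * (x - lo) / (hi - lo) - 1.0       # rescale data to [-1, 1]
    V = legendre.legvander(t, degree + 1)       # columns P_0 .. P_{degree+1}
    g = np.empty((t.size, degree + 1))
    # c_k = (2k+1)/2 * integral over [t_i, 1] of P_k, averaged over samples.
    g[:, 0] = 0.5 * (1.0 - t)
    # For k >= 1 the integral has the closed form (P_{k-1} - P_{k+1}) / (2k+1).
    g[:, 1:] = 0.5 * (V[:, :-2] - V[:, 2:])
    return g.mean(axis=0)


def privatize_coeffs(coeffs, n, epsilon, delta, rng=None):
    """Add Gaussian noise to the coefficients (assumed calibration, epsilon < 1)."""
    rng = np.random.default_rng() if rng is None else rng
    K = coeffs.size - 1
    # Each per-sample term lies in [0, 1] (k = 0) or [-1, 1] (k >= 1), so
    # replacing one sample moves the mean by at most sqrt(1 + 4K) / n in L2.
    sensitivity = np.sqrt(1.0 + 4.0 * K) / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return coeffs + rng.normal(0.0, sigma, size=coeffs.shape)


def eval_cdf(coeffs, x, lo=0.0, hi=1.0):
    """Evaluate the projected CDF; clipping to [0, 1] is post-processing."""
    t = 2.0 * (np.asarray(x, dtype=float) - lo) / (hi - lo) - 1.0
    return np.clip(legendre.legval(t, coeffs), 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.beta(2.0, 5.0, size=5000)        # synthetic data in [0, 1]
    c = legendre_cdf_coeffs(data, degree=8)
    c_priv = privatize_coeffs(c, data.size, epsilon=0.5, delta=1e-5, rng=rng)
    print(eval_cdf(c_priv, [0.1, 0.3, 0.5]))
```

Because the coefficients are sample averages, the coefficients of pooled data equal the sample-size-weighted average of per-site coefficients. In this sketch, that linearity is what makes decentralized aggregation and incremental updates straightforward: sites or batches only need to exchange a short coefficient vector. The sketch does not enforce monotonicity of the evaluated curve.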
