Resampling methods for private statistical inference
Karan Chadha, John Duchi, Rohith Kuditipudi
TL;DR
This work addresses the problem of constructing valid confidence sets under differential privacy for general estimators. It develops two private variants of the non-parametric bootstrap that privately aggregate results from many little bootstraps via median-like mechanisms, enabling percentile and normal-approximation confidence intervals with asymptotic guarantees. Under mild regularity conditions and standard Edgeworth-expansion assumptions, the private methods attain coverage $1 - \alpha$ up to a $\widetilde{O}(n^{-1/2})$ error and produce substantially shorter intervals than prior private methods in mean, median, and logistic regression tasks. The methods rely on private median aggregation for stability and privacy, and they extend to objective-perturbation-based M-estimation with subsampling, offering a practical framework for privacy-preserving uncertainty quantification across a broad class of estimators.
Abstract
We consider the task of constructing confidence intervals with differential privacy. We propose two private variants of the non-parametric bootstrap, which privately compute the median of the results of multiple "little" bootstraps run on partitions of the data, and we give asymptotic bounds on the coverage error of the resulting confidence intervals. For a fixed differential privacy parameter $\varepsilon$, our methods enjoy the same error rates as the non-private bootstrap to within logarithmic factors in the sample size $n$. We empirically validate the performance of our methods for mean estimation, median estimation, and logistic regression with both real and synthetic data. Our methods achieve similar coverage accuracy to existing methods (and non-private baselines) while providing notably shorter ($\gtrsim 10$ times) confidence intervals than previous approaches.
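The partition-then-aggregate idea described above can be illustrated with a minimal sketch; it is not the authors' algorithm. It assumes a scalar estimator, known bounds `lo` and `hi` on the parameter, and a standard exponential mechanism over a fixed grid as a stand-in for the paper's median-like mechanism. All function names, the grid size, the even split of $\varepsilon$ across the two endpoints, and the default block and resample counts are hypothetical choices made for illustration.

```python
import numpy as np


def private_median(values, lo, hi, eps, rng, grid_size=1001):
    """Exponential mechanism for the median over a fixed grid on [lo, hi].

    Assumption: this generic mechanism stands in for the paper's
    median-like aggregation, which may differ.
    """
    values = np.sort(np.clip(np.asarray(values, dtype=float), lo, hi))
    grid = np.linspace(lo, hi, grid_size)
    # Rank of each grid point: number of values less than or equal to it.
    ranks = np.searchsorted(values, grid, side="right")
    # Utility is highest where the rank is closest to half the values.
    utility = -np.abs(ranks - len(values) / 2.0)
    # Replacing one value moves each rank by at most 1 (sensitivity 1).
    logits = eps * utility / 2.0
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return float(rng.choice(grid, p=probs))


def little_bootstrap_quantiles(block, n, estimator, alpha, n_boot, rng):
    """Percentile endpoints from one 'little' bootstrap on a single block."""
    b = len(block)
    stats = np.empty(n_boot)
    for j in range(n_boot):
        # Resample n points from the block via multinomial counts.
        counts = rng.multinomial(n, np.full(b, 1.0 / b))
        stats[j] = estimator(np.repeat(block, counts))
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])


def private_percentile_ci(data, estimator, eps, lo, hi, alpha=0.05,
                          n_blocks=20, n_boot=100, seed=0):
    """Hypothetical private percentile interval for a scalar estimator."""
    rng = np.random.default_rng(seed)
    data = rng.permutation(np.asarray(data, dtype=float))
    n = len(data)
    # Disjoint blocks: each data point influences exactly one block.
    blocks = np.array_split(data, n_blocks)
    ends = np.array([
        little_bootstrap_quantiles(blk, n, estimator, alpha, n_boot, rng)
        for blk in blocks
    ])
    # Assumption: split the privacy budget evenly across the two endpoints.
    lower = private_median(ends[:, 0], lo, hi, eps / 2, rng)
    upper = private_median(ends[:, 1], lo, hi, eps / 2, rng)
    return min(lower, upper), max(lower, upper)
```

A call such as `private_percentile_ci(x, np.mean, eps=1.0, lo=-5.0, hi=5.0)` would return a pair of endpoints inside `[lo, hi]`. Because the blocks are disjoint, changing one data point alters at most one block's endpoints, which is what lets a median-style aggregator add comparatively little noise.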
