Private Linear Regression with Differential Privacy and PAC Privacy
Hillary Yang, Yuntao Du
TL;DR
This work analyzes privacy-preserving linear regression under two paradigms: differential privacy (DP) with budget $(\varepsilon, \delta)$, and PAC Privacy, which bounds the adversary's posterior via mutual information $MI$. It introduces PAC-LR, a PAC Privacy method based on anisotropic noise that uses SVD projections to reduce sensitivity and adds noise directly to the model weights, enabling a fair comparison with DPSGD-LR. In experiments on three real-world datasets, PAC-LR frequently outperforms DPSGD-LR, particularly under stringent privacy guarantees, and data normalization and regularization prove critical to both approaches. The results offer practical guidance on privacy-utility tradeoffs in private linear regression and point to future work: broadening the comparison to other DP methods, improving sampling efficiency for PAC Privacy, and further exploring how regularization shapes the anisotropic noise. In short, PAC Privacy can yield robust utility for linear models under tight privacy constraints, with techniques such as SVD-based anisotropic noise aiding deployment.
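To make the idea of SVD-based anisotropic noise on model weights concrete, here is a minimal Python sketch. It is not the paper's PAC-LR implementation. The function names (`ridge_fit`, `anisotropic_noise_release`), the half-size subsampling, the ridge penalty `lam`, and the specific noise-variance formula (proportional to the square root of each direction's variance, scaled by the mutual-information budget) are all assumptions made for illustration, and the sketch carries no formal privacy guarantee as written.

```python
import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression weights (illustrative base learner)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def anisotropic_noise_release(X, y, mi_budget=1.0, n_trials=200, lam=1.0, seed=0):
    """Hypothetical sketch: release ridge weights with anisotropic Gaussian noise.

    Assumed recipe (not taken from the paper):
      1. Fit the model on many random half-size subsamples.
      2. Use an SVD of the centered weight samples to find the directions
         along which the weights vary with the data.
      3. Add more noise along high-variance directions, less elsewhere.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Step 1: empirical distribution of weights over random subsamples.
    W = np.empty((n_trials, d))
    for t in range(n_trials):
        idx = rng.choice(n, size=n // 2, replace=False)
        W[t] = ridge_fit(X[idx], y[idx], lam)

    # Step 2: principal directions (rows of Vt) and per-direction variance.
    centered = W - W.mean(axis=0)
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    sqrt_var = s / np.sqrt(n_trials)

    # Step 3: assumed noise variance per direction; a larger mi_budget
    # (weaker privacy) yields proportionally less noise.
    noise_var = sqrt_var * sqrt_var.sum() / (2.0 * mi_budget)

    # Fit on one fresh subsample and perturb the weights in the SVD basis.
    idx = rng.choice(n, size=n // 2, replace=False)
    w = ridge_fit(X[idx], y[idx], lam)
    z = rng.standard_normal(noise_var.shape[0]) * np.sqrt(noise_var)
    return w + Vt.T @ z, noise_var


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.standard_normal((500, 4))
    y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + 0.1 * rng.standard_normal(500)
    w_private, noise_var = anisotropic_noise_release(X, y, mi_budget=0.5)
    print("noisy weights:", w_private)
    print("per-direction noise variance:", noise_var)
```

In this sketch, directions in which the fitted weights barely change across subsamples receive little noise, which is the intuition behind preferring anisotropic over isotropic noise. Stronger regularization (larger `lam`) tends to shrink the spread of the weights across subsamples and therefore the noise that is added.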
Abstract
Linear regression is a fundamental tool for statistical analysis, which has motivated the development of linear regression methods that satisfy provable privacy guarantees so that the learned model reveals little about any one data point used to construct it. Most existing privacy-preserving linear regression methods rely on the well-established framework of differential privacy, while the newly proposed PAC Privacy has not yet been explored in this context. In this paper, we systematically compare linear regression models trained with differential privacy and PAC privacy across three real-world datasets, observing several key findings that impact the performance of privacy-preserving linear regression.
