Efficient Sparse Least Absolute Deviation Regression with Differential Privacy
Weidong Liu, Xiaojun Mao, Xiaofei Zhang, Xin Zhang
TL;DR
This work tackles privacy-preserving sparse regression under a robust loss by focusing on least absolute deviation (LAD) with an $\ell_1$ penalty. It introduces FRAPPE, a fast algorithm that transforms the non-smooth LAD problem into a surrogate least-squares problem via a pseudo-response, and secures $(\epsilon,\delta)$-DP through a three-stage noise injection across initialization, kernel-density estimation, and gradient perturbation. Theoretical results establish DP guarantees and near-oracle statistical accuracy, with a privacy-accuracy trade-off in which the privacy cost scales as $O\left(\sqrt{p \log(1/\delta) \log(N\epsilon)} /(N\epsilon)\right)$ on top of the classical $O\left(\sqrt{s \log p / N}\right)$ rate. Empirical evaluations on synthetic and real data demonstrate that FRAPPE outperforms existing private sparse regression methods, especially under heavy-tailed noise, while maintaining computational efficiency.
Abstract
In recent years, privacy-preserving machine learning algorithms have attracted increasing attention because of their important applications in many scientific fields. However, most privacy-preserving algorithms in the literature require the learning objective to be strongly convex and Lipschitz smooth, and therefore cannot cover a wide class of robust loss functions (e.g., the quantile or least absolute loss). In this work, we develop a fast privacy-preserving learning solution for a sparse robust regression problem. Our learning loss consists of a robust least absolute loss and an $\ell_1$ sparse penalty term. To solve this non-smooth problem quickly under a given privacy budget, we develop a Fast Robust And Privacy-Preserving Estimation (FRAPPE) algorithm for least absolute deviation (LAD) regression. Our algorithm achieves fast estimation by reformulating the sparse LAD problem as a penalized least squares estimation problem, and it adopts a three-stage noise injection to guarantee $(\epsilon,\delta)$-differential privacy. We show that our algorithm achieves a better trade-off between privacy and statistical accuracy than state-of-the-art privacy-preserving regression algorithms. Finally, we conduct experiments to verify the efficiency of the proposed FRAPPE algorithm.
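
The reformulation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a Newton-type pseudo-response of the form $\tilde{y}_i = x_i^\top \beta_0 - \hat{f}(0)^{-1}\left(I(y_i \le x_i^\top \beta_0) - 1/2\right)$, where $\hat{f}(0)$ is a Gaussian-kernel estimate of the error density at zero, and it solves the resulting $\ell_1$-penalized least squares problem by proximal gradient descent with Gaussian noise added to each gradient. The function names, the noise scales `sigma_f` and `sigma_g`, and the floor `f_min` are hypothetical placeholders; they are not calibrated to any $(\epsilon,\delta)$ budget, and no clipping is applied, so the sketch by itself carries no privacy guarantee.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def kde_at_zero(residuals, h):
    """Gaussian-kernel estimate of the residual density at zero, bandwidth h."""
    z = residuals / h
    return np.mean(np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)) / h


def noisy_lasso(X, y_tilde, lam, sigma_g, n_iter, rng):
    """Proximal gradient for (1/2n)||y_tilde - X b||^2 + lam * ||b||_1.

    Gaussian noise with standard deviation sigma_g (an uncalibrated
    placeholder) is added to every gradient.
    """
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the gradient's Lipschitz constant.
    step = n / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y_tilde) / n
        grad = grad + sigma_g * rng.standard_normal(p)
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


def lad_sketch(X, y, beta0, h, lam, sigma_f=0.0, sigma_g=0.0,
               n_iter=200, f_min=1e-3, seed=0):
    """One pseudo-response round starting from an initial estimate beta0."""
    rng = np.random.default_rng(seed)
    r = y - X @ beta0
    # Perturbed density estimate; f_min is a hypothetical floor that keeps
    # the division below stable when the noise pushes the estimate near zero.
    f0 = max(kde_at_zero(r, h) + sigma_f * rng.standard_normal(), f_min)
    # Assumed Newton-type pseudo-response for the median (tau = 1/2).
    y_tilde = X @ beta0 - ((r <= 0) - 0.5) / f0
    return noisy_lasso(X, y_tilde, lam, sigma_g, n_iter, rng)
```

With `sigma_f = sigma_g = 0` the sketch reduces to a non-private pseudo-response Lasso step. In practice `beta0` would come from a separate initial estimator, and the round could be repeated with the output fed back in as the new `beta0`.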
