Revisiting Hyperparameter Tuning with Differential Privacy
Youlong Ding, Xueyang Wu
TL;DR
The paper tackles privacy leakage from hyperparameter tuning under differential privacy and presents a framework that decouples the overall privacy loss from the hyperparameter search space, tying it instead to the final model utility $u^*$. It introduces Propose-test Hyperparameter Tuning with Doubling Step, leveraging AboveThreshold, Sparse Vector Technique, and Subsample-and-Aggregate to produce a low-sensitivity proxy utility and to manage exploration via a doubling strategy. Theoretical results guarantee a privacy bound of $\big(\varepsilon+\tilde{O}(\sqrt{(u^*-u_0)/g})\varepsilon_0, \delta\big)$ and a worst-case iteration count $T=O((u^*-u_0)/g)$, with empirical evidence suggesting near-logarithmic growth due to doubling. The framework enables grid-search style hyperparameter tuning under DP and offers a path toward more private yet high-utility model deployment in privacy-conscious ML workflows.
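To make the AboveThreshold building block named above concrete, here is a minimal sketch of the textbook version of that mechanism, not the paper's full algorithm. The function names, the `sensitivity` parameter, and the example utilities are illustrative assumptions. The noise scales ($2\Delta/\varepsilon$ for the threshold, $4\Delta/\varepsilon$ for each query) follow the standard presentation of AboveThreshold.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def above_threshold(utilities, threshold, epsilon, sensitivity=1.0, rng=None):
    """Return the index of the first utility whose noisy value clears the
    noisy threshold, or None if none does (textbook AboveThreshold).

    `utilities` stands in for low-sensitivity proxy utilities of candidate
    hyperparameters; that mapping is an assumption made for illustration.
    """
    rng = rng or random.Random()
    # Perturb the threshold once, up front.
    noisy_threshold = threshold + laplace(2.0 * sensitivity / epsilon, rng)
    for i, u in enumerate(utilities):
        # Perturb each query independently and halt at the first success.
        if u + laplace(4.0 * sensitivity / epsilon, rng) >= noisy_threshold:
            return i
    return None


# Hypothetical proxy utilities for four candidates, with sensitivity 0.01.
idx = above_threshold([0.61, 0.64, 0.82, 0.85], threshold=0.8,
                      epsilon=1.0, sensitivity=0.01)
print(idx)
```

Because the mechanism halts at the first success, its privacy cost does not grow with the number of candidates scanned, which matches the summary's point that the privacy loss is decoupled from the size of the search space.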
Abstract
Hyperparameter tuning is common practice in applied machine learning, but it is typically ignored in the literature on privacy-preserving machine learning because of its negative effect on the overall privacy parameter. In this paper, we tackle this fundamental yet challenging problem by providing an effective hyperparameter tuning framework with differential privacy. The proposed method allows us to adopt a broader hyperparameter search space, and even to perform a grid search over the whole space, since its privacy loss parameter is independent of the number of hyperparameter candidates. Interestingly, the privacy loss instead correlates with the utility gained from hyperparameter searching, revealing an explicit and mandatory trade-off between privacy and utility. Theoretically, we show that the additional privacy loss incurred by hyperparameter tuning is upper-bounded by the square root of the gained utility. Empirically, however, we note that the additional privacy loss scales like the square root of the logarithm of the utility term, thanks to the design of the doubling step.
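The gap between the worst-case and empirical scaling can be illustrated with a small counting sketch. It rests on simplifying assumptions that are not taken from the paper: the utility gain to cover is measured in units of the base step $g$, a fixed-step search raises the target by one unit per round, and a doubling search raises it by 1, 2, 4, ... units with every proposal succeeding. The function names are hypothetical.

```python
import math


def fixed_step_rounds(gap, g):
    """Rounds needed when the target rises by a fixed step g each round."""
    return math.ceil(gap / g)


def doubling_rounds(gap, g):
    """Rounds needed when the step doubles after every round (g, 2g, 4g, ...),
    assuming each proposal succeeds (an idealized best case)."""
    rounds, covered, step = 0, 0, g
    while covered < gap:
        covered += step
        step *= 2
        rounds += 1
    return rounds


# Utility gain expressed in units of the base step g (here g = 1).
for gap in (8, 64, 1024):
    print(gap, fixed_step_rounds(gap, 1), doubling_rounds(gap, 1))
```

Under these assumptions the fixed-step count grows linearly in the gap while the doubling count grows logarithmically. If the additional privacy loss scales with the square root of the number of rounds, this gives the square-root-of-logarithm behaviour described above.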
