Practical Differentially Private Hyperparameter Tuning with Subsampling
Antti Koskela, Tejas Kulkarni
TL;DR
The paper addresses the high privacy cost and computational burden of hyperparameter tuning for differentially private (DP) machine learning. It introduces a method that tunes hyperparameters on a small random subset of the sensitive data and extrapolates the optimal values to the full dataset, supported by a Rényi differential privacy analysis. The approach lowers both the privacy parameter $\varepsilon$ and the computational overhead, and it achieves a better privacy-utility trade-off than the Papernot and Steinke baseline for DP-SGD and DP-Adam across standard datasets. It builds on random search in which the number of evaluated candidates is itself randomized, and it provides tailored privacy accounting bounds, with practical implications for scalable private learning.
Abstract
Tuning the hyperparameters of differentially private (DP) machine learning (ML) algorithms often requires the use of sensitive data, and this may leak private information via the hyperparameter values. Recently, Papernot and Steinke (2022) proposed a class of DP hyperparameter tuning algorithms in which the number of random search samples is itself randomized. Commonly, these algorithms still considerably increase the privacy parameter $\varepsilon$ compared to non-tuned DP ML model training, and they can be computationally heavy, as evaluating each hyperparameter candidate requires a new training run. We focus on lowering both the DP bounds and the computational cost of these methods by using only a random subset of the sensitive data for the hyperparameter tuning and by extrapolating the optimal values to a larger dataset. We provide a Rényi differential privacy analysis for the proposed method and experimentally show that it consistently leads to a better privacy-utility trade-off than the baseline method by Papernot and Steinke.
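The tuning loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the Poisson-style subsampling with rate `q`, the geometric distribution for the number of trials, and the `train_and_score` callback (a stand-in for one DP training run that returns a quality score) are all assumptions made for illustration. The sketch performs no privacy accounting and omits the extrapolation of the selected values to the full dataset.

```python
import random


def subsample(data, q, rng):
    """Keep each record independently with probability q (assumed sampling scheme)."""
    return [x for x in data if rng.random() < q]


def draw_num_trials(gamma, rng):
    """Draw K >= 1 from a geometric distribution with success probability gamma.

    The choice of distribution is an assumption; it only illustrates that the
    number of random search samples is itself random.
    """
    k = 1
    while rng.random() > gamma:
        k += 1
    return k


def tune_on_subset(data, candidates, train_and_score, q=0.1, gamma=0.2, seed=0):
    """Random search over `candidates` using only a random subset of `data`.

    `train_and_score(subset, hp)` is a hypothetical callback standing in for a
    DP training run (e.g. DP-SGD) followed by a score; higher is better.
    Returns (best_hp, best_score, subset_size, num_trials).
    """
    rng = random.Random(seed)
    subset = subsample(data, q, rng)
    num_trials = draw_num_trials(gamma, rng)

    best_hp, best_score = None, None
    for _ in range(num_trials):
        hp = rng.choice(candidates)  # sample a candidate uniformly at random
        score = train_and_score(subset, hp)
        if best_score is None or score > best_score:
            best_hp, best_score = hp, score
    return best_hp, best_score, len(subset), num_trials
```

In this sketch each candidate is trained on roughly a fraction `q` of the data, which is where the computational saving comes from; the privacy benefit of subsampling would have to be established by a separate analysis such as the one the paper provides.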
