DP-HYPE: Distributed Differentially Private Hyperparameter Search
Johannes Liebenow, Thorsten Peinemann, Esfandiar Mohammadi
TL;DR
DP-HYPE tackles the challenge of privately tuning hyperparameters in distributed learning without a trusted aggregator. It combines per-client local evaluations, a k-vote private voting scheme, and secure aggregation with Rényi-DP-based privacy accounting to select a compromise hyperparameter that generalizes across clients. The method achieves client-level DP with a guarantee that does not depend on the number of hyperparameters, and it comes with explicit utility bounds while remaining scalable to large numbers of clients and hyperparameters. Empirical results on MNIST, CIFAR-10, and Adult demonstrate strong performance under iid and non-iid data, even for small privacy budgets, and the approach is implemented as a Flower submodule for practical use.
Abstract
The tuning of hyperparameters in distributed machine learning can substantially impact model performance. When the hyperparameters are tuned on sensitive data, privacy becomes an important challenge, and differential privacy has emerged as the de facto standard for provable privacy. A standard setting in distributed learning is that clients agree on a shared setup, i.e., they find a compromise from a set of candidate hyperparameters, such as the learning rate of the model to be trained. Yet, prior work on differentially private hyperparameter tuning either uses computationally expensive cryptographic protocols, determines hyperparameters separately for each client, or applies differential privacy locally, which can lead to undesirable utility-privacy trade-offs. In this work, we present our algorithm DP-HYPE, which performs a distributed and privacy-preserving hyperparameter search by conducting a distributed vote based on the clients' local hyperparameter evaluations. In this way, DP-HYPE selects hyperparameters that represent a compromise supported by the majority of clients, while remaining scalable and independent of the specific learning task. We prove that DP-HYPE satisfies client-level differential privacy, a strong notion of differential privacy, and, importantly, show that its privacy guarantees do not depend on the number of hyperparameters. We also provide bounds on its utility, that is, on the probability of reaching a compromise, and implement DP-HYPE as a submodule in the popular Flower framework for distributed machine learning. In addition, we evaluate performance on multiple benchmark data sets in iid as well as multiple non-iid settings and demonstrate high utility of DP-HYPE even under small privacy budgets.
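The voting idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`client_vote`, `private_vote`), the use of Gaussian noise, and the noise scale are assumptions made for illustration, and the plain sum stands in for a secure aggregation protocol. Each client evaluates all candidates locally and votes for its k best; only the noisy vote histogram determines the winner.

```python
import numpy as np

def client_vote(scores, k):
    """Return a k-hot vote vector marking a client's k best-scoring candidates."""
    scores = np.asarray(scores, dtype=float)
    vote = np.zeros(scores.shape[0])
    vote[np.argsort(scores)[-k:]] = 1.0  # indices of the k highest local scores
    return vote

def private_vote(all_scores, k, sigma, rng):
    """Aggregate k-hot votes, add noise, and return (winner index, exact histogram).

    Assumptions for this sketch: the sum below stands in for secure
    aggregation, and Gaussian noise is scaled by sqrt(k), the L2 norm of a
    single client's k-hot vote.
    """
    votes = np.stack([client_vote(s, k) for s in all_scores])
    histogram = votes.sum(axis=0)
    noise = rng.normal(0.0, sigma * np.sqrt(k), size=histogram.shape)
    return int(np.argmax(histogram + noise)), histogram

# Three hypothetical clients score three candidate learning rates.
local_scores = [[0.9, 0.5, 0.1], [0.8, 0.7, 0.2], [0.1, 0.6, 0.9]]
winner, histogram = private_vote(local_scores, k=2, sigma=0.0, rng=np.random.default_rng(0))
```

Under these assumptions, adding or removing one client changes the histogram by a vector of L2 norm sqrt(k), regardless of how many candidates are on the ballot, which gives some intuition for a privacy guarantee that does not grow with the number of hyperparameters. Setting `sigma=0.0` disables the noise and is only useful for checking the voting logic.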
