Distributed DP-Helmet: Scalable Differentially Private Non-interactive Averaging of Single Layers
Moritz Kirschte, Sebastian Meiser, Saman Ardalan, Esfandiar Mohammadi
TL;DR
Distributed DP-Helmet introduces a non-interactive, scalable framework for differentially private learning across many users via blind averaging. In Phase I, each user locally trains and noises a Softmax-SLP or SVM model; in Phase II, a secure summation protocol computes the average of these models, yielding an $(\varepsilon,\delta)$-DP global model. The authors prove DP guarantees for both learners and show that blind averaging of hinge-loss SVMs converges to a centrally learned SVM. Empirical results on CIFAR-10/100 and federated EMNIST after SimCLR pretraining demonstrate strong utility at tight privacy budgets (for example, 86% accuracy on CIFAR-10 at $\varepsilon = 0.4$ with 1,000 users), with Softmax-SLP often outperforming SVM and with resilience to strongly non-IID data. The work also connects blind averaging to the representer theorem, which serves as a blueprint for convergence results for other ERMs such as Softmax-SLP, and targets scalability to millions of users.
Abstract
In this work, we propose two differentially private, non-interactive, distributed learning algorithms in a framework called Distributed DP-Helmet. Our framework is based on what we coin blind averaging: each user locally learns and noises a model, and all users then jointly compute the mean of their models via a secure summation protocol. We provide experimental evidence that blind averaging for SVMs and a single Softmax layer (Softmax-SLP) can have a strong utility-privacy tradeoff: we reach an accuracy of 86% on CIFAR-10 for $\varepsilon = 0.4$ and 1,000 users, of 44% on CIFAR-100 for $\varepsilon = 1.2$ and 100 users, and of 39% on federated EMNIST for $\varepsilon = 0.4$ and 3,400 users, all after a SimCLR-based pretraining. As an ablation, we study the resilience of our approach to a strongly non-IID setting. On the theoretical side, we show that blind averaging preserves differential privacy if the objective function is smooth, Lipschitz, and strongly convex, as is the case for SVMs. We show that these properties also hold for Softmax-SLP, which is often used for last-layer fine-tuning, so that for a fixed model size the privacy bound $\varepsilon$ of Softmax-SLP no longer depends on the number of classes. This marks a significant advantage in utility and privacy of Softmax-SLP over SVMs. Furthermore, in the limit, blind averaging of hinge-loss SVMs converges to a centrally learned SVM. The latter result is based on the representer theorem and can be seen as a blueprint for finding convergence results for other empirical risk minimizers (ERMs) like Softmax-SLP.
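The blind-averaging data flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the local learner is a ridge regression standing in for the SVM or Softmax-SLP training, `noise_std` is a placeholder that is not calibrated to any $(\varepsilon,\delta)$ guarantee, and secure summation is simulated in a single process with pairwise masks that cancel in the sum. The function names `local_noisy_model`, `masked_updates`, and `blind_average` are hypothetical.

```python
import numpy as np


def local_noisy_model(X, y, noise_std, rng, reg=1.0):
    """Hypothetical local step: fit a ridge regression, then add Gaussian noise.

    This stands in for the paper's local DP training. noise_std is a
    placeholder and is NOT calibrated to a privacy budget.
    """
    d = X.shape[1]
    w = np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ y)
    return w + rng.normal(0.0, noise_std, size=d)


def masked_updates(models, rng):
    """Simulate secure summation: each pair of users shares a random mask.

    User i adds the mask and user j subtracts it, so a single masked model
    looks random while the masks cancel in the sum over all users.
    """
    masked = [m.copy() for m in models]
    n = len(models)
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.normal(size=models[0].shape)
            masked[i] += mask
            masked[j] -= mask
    return masked


def blind_average(models, rng):
    """Average the local models using only their masked versions."""
    masked = masked_updates(models, rng)
    return np.sum(masked, axis=0) / len(models)
```

In this sketch the aggregator only ever sees the masked vectors, yet the result equals the plain mean of the noised local models up to floating-point error. A real deployment would replace the simulated masks with an actual secure summation protocol and derive the noise scale from the sensitivity analysis in the paper.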
