Optimized Tradeoffs for Private Prediction with Majority Ensembling
Shuli Jiang, Qiuyi, Zhang, Gauri Joshi
TL;DR
The Data-dependent Randomized Response Majority algorithm is introduced, parameterized by a data-dependent noise function $\gamma$, and enables efficient utility optimization over the class of all private algorithms, encompassing those standard methods.
Abstract
We study a classical problem in private prediction, the problem of computing an $(mε, δ)$-differentially private majority of $K$ $(ε, Δ)$-differentially private algorithms for $1 \leq m \leq K$ and $1 > δ\geq Δ\geq 0$. Standard methods such as subsampling or randomized response are widely used, but do they provide optimal privacy-utility tradeoffs? To answer this, we introduce the Data-dependent Randomized Response Majority (DaRRM) algorithm. It is parameterized by a data-dependent noise function $γ$, and enables efficient utility optimization over the class of all private algorithms, encompassing those standard methods. We show that maximizing the utility of an $(mε, δ)$-private majority algorithm can be computed tractably through an optimization problem for any $m \leq K$ by a novel structural result that reduces the infinitely many privacy constraints into a polynomial set. In some settings, we show that DaRRM provably enjoys a privacy gain of a factor of 2 over common baselines, with fixed utility. Lastly, we demonstrate the strong empirical effectiveness of our first-of-its-kind privacy-constrained utility optimization for ensembling labels for private prediction from private teachers in image classification. Notably, our DaRRM framework with an optimized $γ$ exhibits substantial utility gains when compared against several baselines.
