Scaling up the Banded Matrix Factorization Mechanism for Differentially Private ML
Ryan McKenna
TL;DR
DP-BandMF scales correlated-noise differential privacy for large-scale ML by introducing (i) efficient strategy optimization that avoids dense $n\times n$ matrices, (ii) banded Toeplitz strategy families for further efficiency, and (iii) distributed noise generation across thousands of machines. The approach reduces the dominant computational burdens from $O(n^3)$ time and $O(n^2)$ memory to near-linear scaling in $n$ for structured classes, enabling training with DP over extremely large iteration counts and parameter counts with negligible utility loss. Empirical results show amplified DP-BandMF outperforming DP-SGD and other scalable MF approaches across a range of settings, with the optimal number of bands $b_*$ roughly following $b_* \approx \epsilon \sqrt{n}/k$ and banded Toeplitz variants offering near-optimal performance in very large regimes. The work provides practical, scalable DP matrix-factorization tooling for large-scale private ML, with implications for federated and centralized training where privacy and efficiency must co-exist.
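To make the banded Toeplitz idea and the band-count rule of thumb concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes the strategy is a lower-triangular banded Toeplitz matrix $C$ described by $b$ coefficients, and it generates correlated noise $z = C^{-1} w$ by forward substitution, keeping only the last $b-1$ outputs in memory instead of a dense $n\times n$ matrix. The function names (`heuristic_num_bands`, `banded_toeplitz_noise`), the rounding and clipping of $b_*$, and the example coefficients are all hypothetical choices made for illustration. Noise is drawn one scalar per step for brevity, whereas training would draw one vector per step.

```python
import math
import random
from collections import deque


def heuristic_num_bands(epsilon: float, n: int, k: int) -> int:
    """Rule of thumb b ~ epsilon * sqrt(n) / k from the TL;DR.

    The rounding and the clipping to [1, n] are assumptions of this sketch.
    """
    b = round(epsilon * math.sqrt(n) / k)
    return max(1, min(n, b))


def banded_toeplitz_noise(coeffs, n, sigma=1.0, rng=None):
    """Yield z_0, ..., z_{n-1} solving C z = w with w_i ~ N(0, sigma^2) i.i.d.

    C is the n x n lower-triangular Toeplitz matrix with
    C[i][j] = coeffs[i - j] for 0 <= i - j < len(coeffs), and 0 elsewhere.
    Only the last len(coeffs) - 1 outputs are stored, so memory is O(b).
    """
    rng = rng or random.Random()
    c0, rest = coeffs[0], coeffs[1:]
    history = deque(maxlen=len(rest))  # most recent output first
    for _ in range(n):
        w = rng.gauss(0.0, sigma)
        # Forward substitution: c0 * z_i + sum_j c_j * z_{i-j} = w_i
        z = (w - sum(c * z_prev for c, z_prev in zip(rest, history))) / c0
        history.appendleft(z)
        yield z


if __name__ == "__main__":
    b = heuristic_num_bands(epsilon=8.0, n=10_000, k=16)
    coeffs = [1.0] + [0.5 ** j for j in range(1, b)]  # illustrative, not optimized
    noise = list(banded_toeplitz_noise(coeffs, n=5, rng=random.Random(0)))
    print(b, noise)
```

Each step costs $O(b)$ time, so generating noise for all $n$ iterations costs $O(nb)$ per coordinate. In practice the coefficients would come from strategy optimization rather than the fixed geometric sequence used here.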
Abstract
Correlated noise mechanisms such as DP Matrix Factorization (DP-MF) have proven to be effective alternatives to DP-SGD in large-epsilon, few-epoch training regimes. Significant work has been done to find the best correlated noise strategies, and the current state-of-the-art approach is DP-BandMF, which optimally balances the benefits of privacy amplification and noise correlation. Despite its utility advantages, severe scalability limitations prevent this mechanism from handling large-scale training scenarios where the number of training iterations may exceed $10^4$ and the number of model parameters may exceed $10^7$. In this work, we present techniques to scale up DP-BandMF along these two dimensions, significantly extending its reach and enabling it to handle settings with virtually any number of model parameters and training iterations, with negligible utility degradation.
