Privacy Amplification for BandMF via $b$-Min-Sep Subsampling

Andy Dong, Arun Ganesh

TL;DR

This work addresses privacy amplification for BandMF under correlated noise by introducing $b$-min-sep subsampling, which unifies Poisson and balls-in-bins schemes and yields stronger amplification. It develops near-exact privacy accounting via Monte Carlo methods, enabled by a dynamic-programming formulation that efficiently computes the likelihood ratio for the resulting Gaussian mixtures. The authors prove that $b$-min-sep matches cyclic Poisson in the high-noise regime and strictly improves in mid-to-low noise, with empirical gains on CIFAR and a multi-attribution arXiv setting, and show the approach naturally extends to user-level privacy. The framework provides a principled, scalable foundation for privacy amplification in correlated-noise DP-SGD/DP-MF and offers a practical path toward stronger, more general privacy guarantees in federated and multi-user contexts.

Abstract

We study privacy amplification for BandMF, i.e., DP-SGD with correlated noise across iterations via a banded correlation matrix. We propose $b$-min-sep subsampling, a new subsampling scheme that generalizes Poisson and balls-in-bins subsampling, extends prior practical batching strategies for BandMF, and enables stronger privacy amplification than cyclic Poisson while preserving the structural properties needed for analysis. We give a near-exact privacy analysis using Monte Carlo accounting, based on a dynamic program that leverages the Markovian structure in the subsampling procedure. We show that $b$-min-sep matches cyclic Poisson subsampling in the high noise regime and achieves strictly better guarantees in the mid-to-low noise regime, with experimental results that bolster our claims. We further show that unlike previous BandMF subsampling schemes, our $b$-min-sep subsampling naturally extends to the multi-attribution user-level privacy setting.
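
To make the min-separation idea concrete, here is a minimal sketch of how one example's participation pattern could be simulated. It is not the authors' sampler. It assumes only that "min-sep" means any two participations of the same example are at least $b$ iterations apart, and, as a further illustrative assumption, that an available example is sampled independently with a fixed probability `p`. The function name `min_sep_participation` and the `cooldown` counter are hypothetical.

```python
import random


def min_sep_participation(n_steps, b, p, rng):
    """Simulate the participation steps of one example (illustrative only).

    Assumption: after participating, the example is unavailable for the
    next b - 1 steps, so consecutive participations are >= b steps apart.
    Assumption: when available, it is sampled with probability p.
    """
    steps = []
    cooldown = 0  # remaining steps during which the example is unavailable
    for t in range(n_steps):
        if cooldown > 0:
            cooldown -= 1
            continue
        if rng.random() < p:
            steps.append(t)
            cooldown = b - 1
    return steps


if __name__ == "__main__":
    rng = random.Random(0)
    print(min_sep_participation(n_steps=30, b=4, p=0.3, rng=rng))
```

Under these assumptions, setting `b = 1` removes the cooldown and the loop reduces to ordinary Poisson subsampling, while larger `b` forces the gaps that a $b$-banded correlation matrix can exploit. The `cooldown` counter plays the role of a per-example availability state.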

Paper Structure

This paper contains 24 sections, 4 theorems, 40 equations, 12 figures, 5 algorithms.

Key Result

Theorem 5.1

For a given $b$-banded (non-negative, lower-triangular) $\mathbf{C}$ and any output $\mathbf{y}$, the main recursion (eq:mainrecursion) holds, and thus $f_1(\mathbf{y}) = P(\mathbf{y}) / Q(\mathbf{y})$, when $\mathbf{x}$ is sampled in $P$ using the procedure in fig:bms_basic.
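
Once the likelihood ratio $P(\mathbf{y}) / Q(\mathbf{y})$ can be evaluated, a Monte Carlo accountant can estimate $\delta(\varepsilon)$ as the average of $\max(0, 1 - e^{\varepsilon} Q(\mathbf{y}) / P(\mathbf{y}))$ over samples $\mathbf{y} \sim P$. The sketch below illustrates only that estimator. It does not implement the paper's recursion or its Gaussian mixtures: as a stand-in it uses a plain one-dimensional Gaussian pair, $P = N(1, \sigma^2)$ and $Q = N(0, \sigma^2)$, so the estimate can be checked against the known closed form. All function names are hypothetical.

```python
import math
import random


def q_over_p(y, sigma):
    """Ratio Q(y)/P(y) for the stand-in pair Q = N(0, sigma^2), P = N(1, sigma^2)."""
    return math.exp((1.0 - 2.0 * y) / (2.0 * sigma ** 2))


def mc_delta(eps, sigma, n_samples, rng):
    """Monte Carlo estimate of delta(eps) = E_{y~P}[max(0, 1 - e^eps * Q(y)/P(y))]."""
    total = 0.0
    for _ in range(n_samples):
        y = rng.gauss(1.0, sigma)  # draw y from P
        total += max(0.0, 1.0 - math.exp(eps) * q_over_p(y, sigma))
    return total / n_samples


def exact_delta(eps, sigma):
    """Closed-form delta(eps) for the Gaussian mechanism with sensitivity 1."""
    def phi(z):
        # Standard normal CDF.
        return 0.5 * math.erfc(-z / math.sqrt(2.0))

    a = 1.0 / (2.0 * sigma)
    return phi(a - eps * sigma) - math.exp(eps) * phi(-a - eps * sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    print("Monte Carlo:", mc_delta(1.0, 1.0, 200_000, rng))
    print("Closed form:", exact_delta(1.0, 1.0))
```

In the paper's setting the closed-form ratio in `q_over_p` would presumably be replaced by the ratio computed through the recursion; this sketch covers one direction of the divergence and ignores the estimator's sampling error.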

Figures (12)

  • Figure 1: Markov chain governing the availability state of a single example in $b$-min-sep subsampling.
  • Figure 2: MSE achieved by different privacy amplification methods for the CIFAR setting.
  • Figure 3: Left: Average test accuracy with 95% confidence intervals for different $\varepsilon$ values and sampling schemes. Middle, right: The same, but with either the test accuracy of cyclic Poisson or balls-in-bins subtracted.
  • Figure 4: MSE achieved by different privacy amplification methods for the CIFAR setting. For balls-in-bins and $b$-min-sep subsampling, we assume optimistic Monte Carlo accounting.
  • Figure 5: Left: Average test accuracy with 95% confidence intervals for different $\varepsilon$ values and sampling schemes. Middle, right: The same, but with either the test accuracy of cyclic Poisson or balls-in-bins subtracted. We assume optimistic Monte Carlo accounting, i.e. assume the estimates given by Monte Carlo estimation are exact.
  • ...and 7 more figures

Theorems & Definitions (11)

  • Definition 2.1
  • Definition 2.2
  • Theorem 5.1
  • proof
  • Theorem 6.1
  • proof
  • Conjecture 6.2: Validity of Monte Carlo recursion for participation-based blocking
  • Theorem A.1
  • proof
  • Lemma B.1
  • ...and 1 more