Are Normalizing Flows the Key to Unlocking the Exponential Mechanism?

Robert A. Bridges, Vandy J. Tombs, Christopher B. Stanley

TL;DR

This work investigates making the Exponential Mechanism (ExpM) practical for private optimization on continuous spaces by pairing it with a Normalizing Flow (ExpM+NF). It shows that an NF can approximately sample from the ExpM density, yielding accuracy close to non-private SGD on MIMIC-III benchmarks and faster training than Opacus' DPSGD implementation, although no formal privacy proof is given for the NF approximation. The paper proves a sensitivity bound for the $\ell^2$ loss, explores Bayesian inference via ExpM+NF, and runs empirical privacy experiments (the LiRA membership inference attack and the privacy-auditing lower bound of Steinke et al.) that yield mixed results, highlighting the gap between empirical privacy and formal guarantees. It also develops theoretical tools, such as a DP distance and a Privacy Squeeze Theorem, as steps toward a formal DP proof for ExpM+NF, and outlines current limitations and future directions for rigorous privacy guarantees and broader applicability.

Abstract

The Exponential Mechanism (ExpM), designed for private optimization, has been historically sidelined from use on continuous sample spaces, as it requires sampling from a generally intractable density, and, to a lesser extent, bounding the sensitivity of the objective function. Any differential privacy (DP) mechanism can be instantiated as ExpM, and ExpM poses an elegant solution for private machine learning (ML) that bypasses inherent inefficiencies of DPSGD. This paper seeks to operationalize ExpM for private optimization and ML by using an auxiliary Normalizing Flow (NF), an expressive deep network for density learning, to approximately sample from the ExpM density. The method, ExpM+NF, is an alternative to SGD methods for model training. We prove a sensitivity bound for the $\ell^2$ loss, permitting ExpM use with any sampling method. To test feasibility, we present results on MIMIC-III health data comparing the accuracy and training time of (non-private) SGD, DPSGD, and ExpM+NF training methods. We find that a model sampled from ExpM+NF is nearly as accurate as non-private SGD and more accurate than DPSGD, and that ExpM+NF trains faster than Opacus' DPSGD implementation. Unable to provide a privacy proof for the NF approximation, we present empirical results to investigate privacy, including the LiRA membership inference attack of Carlini et al. and the recent privacy auditing lower bound method of Steinke et al. Our findings suggest ExpM+NF provides more privacy than non-private SGD, but not as much as DPSGD, although many attacks are impotent against any model. Ancillary benefits of this work include pushing the SOTA of privacy and accuracy on MIMIC-III healthcare data, exhibiting the use of ExpM+NF for Bayesian inference, showing the limitations of empirical privacy auditing in practice, and providing several privacy theorems applicable to distribution learning.

Paper Structure (34 sections, 10 theorems, 4 equations, 9 figures, 6 tables, 1 algorithm)

Key Result

Theorem 2.4

The Exponential Mechanism (ExpM), i.e., sampling $\theta \sim p_{\mathrm{ExpM}}(\theta, X) \propto \exp(\varepsilon u(\theta, X) / (2s))$, where the utility function $u$ has sensitivity $s$, provides $\varepsilon$-DP.
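
To make the statement concrete, the following is a minimal sketch of ExpM on a finite grid of candidate parameters, where the density can be normalized exactly. It is not the paper's ExpM+NF method, which targets continuous spaces where this normalization is intractable. The function names, the toy utility (negative sum of squared errors), and the data are hypothetical; the sensitivity $s = 1$ assumes data and candidates lie in $[0, 1]$ and that neighboring datasets differ by replacing one record.

```python
import numpy as np


def neg_squared_error(theta, data):
    """Toy utility (hypothetical): negative sum of squared errors, higher is better."""
    return -np.sum((data - theta) ** 2)


def exponential_mechanism(candidates, utility, data, epsilon, sensitivity, rng):
    """Sample one candidate with probability proportional to exp(eps * u / (2 s))."""
    scores = np.array([utility(theta, data) for theta in candidates], dtype=float)
    logits = epsilon * scores / (2.0 * sensitivity)
    logits -= logits.max()  # shift for numerical stability; the distribution is unchanged
    probs = np.exp(logits)
    probs /= probs.sum()
    idx = rng.choice(len(candidates), p=probs)
    return candidates[idx], probs


# Toy usage: privately pick a value in [0, 1] that fits the data well.
# With data and candidates in [0, 1], replacing one record changes the
# utility by at most 1, so sensitivity = 1 is assumed here.
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, size=50)
candidates = np.linspace(0.0, 1.0, 101)

theta, probs = exponential_mechanism(
    candidates, neg_squared_error, data, epsilon=1.0, sensitivity=1.0, rng=rng
)
print("sampled theta:", theta)
print("most likely theta:", candidates[np.argmax(probs)])
```

On a finite candidate set the $\varepsilon$-DP guarantee can be checked numerically: for two neighboring datasets, the ratio of the returned probabilities should not exceed $e^{\varepsilon}$ for any candidate.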

Figures (9)

  • Figure 1: A to-be-publicized model trained on private data. The new technique, ExpM+NF, trains an auxiliary Normalizing Flow (NF) model to approximately sample from the otherwise intractable Exponential Mechanism (ExpM) density to produce near-optimal model parameters $\theta$.
  • Figure 2: Both the left and right plots show a histogram and Q-Q plot for a random data point. Blue shows the logit-scaled confidences when the data point is included in the shadow model's training set; green shows the logit-scaled confidences when the data point is not a member. These show that the per-example logit-scaled confidences are nearly Gaussian.
  • Figure 3: Likelihood Ratio Attack (median) ROC curve, with 5th and 95th percentiles shaded, on a log-log scale for Logistic Regression on the Mortality Task.
  • Figure 4: Likelihood Ratio Attack (median) ROC curve, with 5th and 95th percentiles shaded, on a log-log scale for Logistic Regression on the Length of Stay Task.
  • Figure 5: The left density is the target ExpM density, $p_X$, for training a logistic regression classifier with two parameters. The right density is the NF approximation, $q_X$, of $p_X$. The comparison shows the NF approximation is too peaked at the mode to be a good approximation. Hence, sampling from the NF approximation yields accurate model parameters with high likelihood, but intuitively should provide strictly less privacy than the target ExpM density, as it has too little variance. Compare with the figure labeled fig:expm_vary_eps.
  • ...and 4 more figures

Theorems & Definitions (16)

  • Definition 2.1: Differential Privacy
  • Definition 2.2: Sensitivity
  • Definition 2.3: ExpM
  • Theorem 2.4: ExpM satisfies $\varepsilon$-DP (McSherry and Talwar, 2007; Dwork and Roth, 2014)
  • Theorem 3.1: Sensitivity bound for $\ell^2$ loss
  • Corollary 3.2
  • Definition 7.1: DP Distance on $\mathcal{P}$
  • Proposition 7.2
  • Theorem 7.3
  • Theorem 7.4
  • ...and 6 more