Noise-Aware Differentially Private Variational Inference

Talal Alrawajfeh, Joonas Jälkö, Antti Honkela

TL;DR

This work addresses uncertainty quantification for Bayesian inference under differential privacy by introducing Noise-Aware DP Variational Inference (NA-DPVI). NA-DPVI integrates DP-induced gradient noise into a post-processing gradient-trace model, inferring latent quantities such as the Hessian and optimal parameters to yield a noise-aware posterior $\widetilde{p}(\boldsymbol{\theta}|\mathcal{T})$. The authors provide theoretical justification (Theorem 1 and related analysis) and validate the method across high-dimensional and non-conjugate settings, including 10D Bayesian linear regression and UCI Adult Bayesian logistic regression, demonstrating improved coverage and calibrated predictive distributions compared to baselines. The approach broadens the applicability of noise-aware inference beyond simple models and offers a principled framework for uncertainty quantification in privacy-preserving Bayesian analyses, though it remains tied to VI quality and practical privacy-utility considerations such as hyperparameter privacy leakage.
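
As a rough illustration of what a gradient-trace model can look like, the sketch below simulates noisy gradients of the linear form $\kappa\mathbf{A}(\boldsymbol{\phi}_t - \boldsymbol{\phi}^{*})$ plus Gaussian noise, then recovers $\mathbf{A}$ and $\boldsymbol{\phi}^{*}$ from the trace. It is not the paper's inference procedure: ordinary least squares stands in for the Bayesian post-processing, the Gaussian noise is only a stand-in for DP perturbation, and the function names, step size, and numeric values are all hypothetical.

```python
import numpy as np

def simulate_trace(A, phi_star, kappa, noise_std, steps, lr, rng):
    """Noisy gradient descent on a quadratic loss; returns iterates and gradients."""
    phi = np.zeros_like(phi_star)
    phis, grads = [], []
    for _ in range(steps):
        # Linear signal kappa * A @ (phi - phi_star) plus Gaussian noise
        # (a stand-in for DP perturbation; gradient clipping is not modelled).
        g = kappa * A @ (phi - phi_star) + noise_std * rng.standard_normal(phi.shape)
        phis.append(phi.copy())
        grads.append(g)
        phi = phi - lr * g
    return np.array(phis), np.array(grads)

def fit_linear_trace(phis, grads, kappa):
    """Least-squares fit of g_t = M @ phi_t + b, with M = kappa * A and b = -M @ phi_star."""
    X = np.hstack([phis, np.ones((phis.shape[0], 1))])  # add intercept column
    coef, *_ = np.linalg.lstsq(X, grads, rcond=None)
    M = coef[:-1].T          # rows of coef[:-1] hold M transposed
    b = coef[-1]
    A_hat = M / kappa
    phi_star_hat = -np.linalg.solve(M, b)
    return A_hat, phi_star_hat

# Hypothetical 2D example.
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.5], [0.5, 5.0]])
phi_star = np.array([1.0, -2.0])
kappa = 0.1

phis, grads = simulate_trace(A, phi_star, kappa, noise_std=0.01, steps=2000, lr=0.5, rng=rng)
A_hat, phi_star_hat = fit_linear_trace(phis, grads, kappa)
print("estimated A:\n", A_hat)
print("estimated optimum:", phi_star_hat)
```

A point estimate like this ignores the uncertainty introduced by the noise; the method summarised above instead targets a posterior over these latent quantities.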

Abstract

Differential privacy (DP) provides robust privacy guarantees for statistical inference, but this can lead to unreliable results and biases in downstream applications. While several noise-aware approaches have been proposed which integrate DP perturbation into the inference, they are limited to specific types of simple probabilistic models. In this work, we propose a novel method for noise-aware approximate Bayesian inference based on stochastic gradient variational inference which can also be applied to high-dimensional and non-conjugate models. We also propose a more accurate evaluation method for noise-aware posteriors. Empirically, our inference method has similar performance to existing methods in the domain where they are applicable. Outside this domain, we obtain accurate coverages on high-dimensional Bayesian linear regression and well-calibrated predictive probabilities on Bayesian logistic regression with the UCI Adult dataset.

Paper Structure

This paper contains 32 sections, 5 theorems, 145 equations, 4 figures, 3 tables, 4 algorithms.

Key Result

Theorem 1

Under the assumptions of existence of an optimum for the objective, a Taylor approximation of the loss, a Lipschitz per-example loss, and bounded SGD covariance, there exists a matrix $\mathbf{A}$ such that [equation missing in the source], where $e_{\text{approx}}^2 = e^2_{\text{sub}} + (\kappa \times e_{\text{tay}} \times r^{*})^2$.
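
The stated error term adds two contributions in quadrature. A minimal sketch of that arithmetic follows; the function name and the input values are hypothetical and chosen only so the result is easy to check by hand.

```python
import math

def approx_error(e_sub, kappa, e_tay, r_star):
    """e_approx = sqrt(e_sub^2 + (kappa * e_tay * r_star)^2), as in the formula above."""
    return math.sqrt(e_sub**2 + (kappa * e_tay * r_star) ** 2)

# Hypothetical values: e_sub = 3 and kappa * e_tay * r_star = 4 combine to about 5.
e_approx = approx_error(e_sub=3.0, kappa=0.1, e_tay=4.0, r_star=10.0)
print(e_approx)
```

Because the two terms are squared before summing, the larger of the two dominates $e_{\text{approx}}$.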

Figures (4)

  • Figure 1: An example of the linear model of the perturbed gradients $\kappa\mathbf{A}(\pmb{\phi}_t - \pmb{\phi}^{*})$ (black line) based on \ref{expfam-1}.
  • Figure 2: The first row shows the TARP coverages for the exponential families experiment for NA-DPVI (NUTS), Bernstein & Sheldon's method (BernsteinS18), last iterate DPVI, and DPVIm (dpvim). The second row shows the coverage error ($C(\alpha) - (1 - \alpha)$). The solid lines show the average performance over $20$ independent TARP repetitions and the error bars show the corresponding standard deviation. The parameters for NA-DPVI are $\delta = 10^{-5}$, $N = 5000$, $\kappa = 0.1$ and $T = 10^4$.
  • Figure 3: The first row shows the TARP coverages for the 10D Bayesian linear regression experiment for NA-DPVI (NUTS, Laplace), last iterate DPVI, DPVIm (dpvim), and Gibbs-SS-Noisy (BernsteinS19). The second row shows the coverage error ($C(\alpha) - (1 - \alpha)$). The solid lines show the average performance over $20$ independent TARP repetitions and the error bars show the corresponding standard deviation. The parameters for NA-DPVI are $\delta = 10^{-5}$, $N = 5000$, $\kappa = 0.1$ and $T = 10^4$.
  • Figure 4: The first row shows the predictive calibration for the UCI Adult logistic regression experiment for NA-DPVI (NUTS), NA-DPVI (Laplace), last iterate DPVI, and DPVIm (dpvim). The second row shows the calibration error (fraction of positives minus predicted probability). The solid lines show the average performance over $20$ independent repetitions, and the error bars show the corresponding standard deviation. The parameters for NA-DPVI are $\delta = 10^{-5}$, $\kappa = 0.1$ and $T = 10^4$.
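
The coverage error $C(\alpha) - (1 - \alpha)$ plotted in Figures 2 and 3 can be illustrated with a simpler check than TARP. The sketch below computes the empirical coverage of central credible intervals in a one-dimensional conjugate Gaussian model, where the exact posterior is known, so the error should be close to zero. The function name, the toy model, and the sample sizes are assumptions made for illustration; this is not the TARP procedure used in the paper.

```python
import numpy as np

def coverage_error(truths, posterior_samples, alphas):
    """Return C(alpha) - (1 - alpha) for central (1 - alpha) credible intervals.

    truths: shape (R,), one true parameter per repetition.
    posterior_samples: shape (R, S), S posterior draws per repetition.
    """
    errors = []
    for alpha in alphas:
        lo = np.quantile(posterior_samples, alpha / 2, axis=1)
        hi = np.quantile(posterior_samples, 1 - alpha / 2, axis=1)
        covered = (truths >= lo) & (truths <= hi)
        errors.append(covered.mean() - (1 - alpha))
    return np.array(errors)

rng = np.random.default_rng(0)
R, S = 2000, 500

# Toy conjugate model: theta ~ N(0, 1), y | theta ~ N(theta, 1),
# so the exact posterior is N(y / 2, 1 / 2).
theta = rng.standard_normal(R)
y = theta + rng.standard_normal(R)
samples = y[:, None] / 2 + np.sqrt(0.5) * rng.standard_normal((R, S))

alphas = np.array([0.1, 0.2, 0.5])
errors = coverage_error(theta, samples, alphas)
print(errors)
```

Negative values indicate intervals that cover the truth less often than their nominal level (overconfidence), and positive values indicate intervals that are wider than necessary.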

Theorems & Definitions (13)

  • Definition 1: pmlr-v202-lemos23a
  • Definition 2
  • Definition 3
  • Theorem 1
  • proof
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • proof
  • proof
  • ...and 3 more