Residual-PAC Privacy: Automatic Privacy Control Beyond the Gaussian Barrier
Tao Zhang, Yevgeniy Vorobeychik
TL;DR
The paper addresses the inefficiency of Gaussian-based Auto-PAC privacy by introducing Residual-PAC (R-PAC) Privacy and the Stackelberg Residual-PAC (SR-PAC) mechanism, which quantify and exploit non-Gaussian structure in data to allocate privacy noise more efficiently. It first characterizes the Gaussian barrier in PAC Privacy: the Gaussian mutual information upper bound used by Auto-PAC is tight if and only if the perturbed mechanism output is jointly Gaussian with independent Gaussian noise. It then provides two post-processing corrections for Auto-PAC, based on the Donsker-Varadhan (DV) representation and sliced Wasserstein distances. Building on this, it defines R-PAC Privacy, which uses f-divergences to quantify the privacy that remains after adversarial inference, and implements SR-PAC as a convex Stackelberg game that automatically chooses noise distributions to maximize utility while respecting a privacy budget. The approach uses the privacy budget more tightly, produces anisotropic, directional noise tailored to the data geometry, and offers provable advantages under composition. Experiments against PAC and differential privacy (DP) baselines show improved privacy-utility tradeoffs across diverse datasets and distributions, and Monte Carlo simulation keeps the method practical at scale.
Abstract
The Probably Approximately Correct (PAC) Privacy framework [46] provides a powerful instance-based methodology to preserve privacy in complex data-driven systems. Existing PAC Privacy algorithms (we call them Auto-PAC) rely on a Gaussian mutual information upper bound. We show that this upper bound is tight if and only if the perturbed mechanism output is jointly Gaussian with independent Gaussian noise, so for non-Gaussian outputs part of the privacy budget is wasted. We propose two approaches for addressing this issue. First, we introduce two tractable post-processing methods for Auto-PAC, based on the Donsker-Varadhan representation and sliced Wasserstein distances. These corrections still leave some privacy budget unused. To close the gap more fundamentally, we introduce Residual-PAC (R-PAC) Privacy, an f-divergence-based measure that quantifies the privacy that remains after adversarial inference. To implement R-PAC Privacy in practice, we propose the Stackelberg Residual-PAC (SR-PAC) privatization mechanism, a game-theoretic framework that selects optimal noise distributions through convex bilevel optimization. Our approach achieves efficient privacy budget utilization for arbitrary data distributions and composes naturally when multiple mechanisms access the dataset. Through extensive experiments, we demonstrate that SR-PAC consistently obtains a better privacy-utility tradeoff than both PAC and differential privacy baselines.
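
To make the Gaussian barrier concrete, the following is a minimal one-dimensional sketch, not the paper's algorithm. It assumes a toy setting chosen here for illustration: the mechanism output X is uniform on {-1, +1} (variance 1) and the noise B is Gaussian and independent of X. The function names `gaussian_mi_bound` and `mi_binary_plus_gaussian` are hypothetical. The sketch compares the Gaussian bound 0.5 * log(1 + Var(X) / Var(B)) with the true mutual information I(X; X + B) = h(X + B) - h(B), where h(X + B) is obtained by numerical integration of the mixture density.

```python
import numpy as np


def gaussian_mi_bound(signal_var: float, noise_var: float) -> float:
    """Gaussian upper bound 0.5 * log(1 + signal_var / noise_var), in nats."""
    return 0.5 * float(np.log1p(signal_var / noise_var))


def mi_binary_plus_gaussian(noise_std: float, grid_points: int = 200_001) -> float:
    """I(X; X + B) in nats for X uniform on {-1, +1} and B ~ N(0, noise_std^2).

    Toy assumption for illustration only: the output Y = X + B is a
    two-component Gaussian mixture, so it is not Gaussian.
    """
    # Integration grid covering both mixture components out to 12 noise std devs.
    half_width = 1.0 + 12.0 * noise_std
    y = np.linspace(-half_width, half_width, grid_points)
    dy = y[1] - y[0]

    def normal_pdf(z):
        return np.exp(-0.5 * (z / noise_std) ** 2) / (noise_std * np.sqrt(2.0 * np.pi))

    # Density of Y and its differential entropy h(Y) = -integral of p log p.
    p_y = 0.5 * normal_pdf(y - 1.0) + 0.5 * normal_pdf(y + 1.0)
    h_y = -np.sum(p_y * np.log(np.maximum(p_y, 1e-300))) * dy

    # h(Y | X) = h(B) because B is independent of X.
    h_b = 0.5 * np.log(2.0 * np.pi * np.e * noise_std ** 2)
    return float(h_y - h_b)


if __name__ == "__main__":
    for noise_std in (0.25, 0.5, 1.0, 2.0):
        true_mi = mi_binary_plus_gaussian(noise_std)
        bound = gaussian_mi_bound(1.0, noise_std ** 2)
        print(f"noise_std={noise_std}: true MI={true_mi:.4f}, Gaussian bound={bound:.4f}")
```

In this toy case the true mutual information can never exceed ln 2 (about 0.693 nats) because X is binary, while the Gaussian bound grows without limit as the noise shrinks; at noise_std = 0.5 the bound is already 0.5 ln 5 (about 0.805 nats). Noise calibrated to the Gaussian bound is therefore larger than the privacy budget actually requires.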
