Differential Confounding Privacy and Inverse Composition
Tao Zhang, Bradley A. Malin, Netanel Raviv, Yevgeniy Vorobeychik
TL;DR
This work extends differential privacy to settings where the secret $S$ is not simply contained in the dataset $X$. It introduces differential confounding privacy (DCP), a Pufferfish-inspired framework that uses $(\epsilon, \delta)$-indistinguishability to quantify privacy loss under general dependencies between $S$ and $X$. The paper analyzes how composition under DCP differs from composition under DP: DCP mechanisms still retain privacy guarantees when composed, but they lack DP's graceful, additive bounds because of copula-driven dependencies among mechanisms. To address this, the authors propose Inverse Composition (IC), a leader-follower optimization that designs a privacy strategy to guarantee a target $(\epsilon, \delta)$-DCP level under composition without relying on worst-case proofs, with convex reformulations under a strictly proper scoring rule. They validate IC through numerical experiments on genomic data, demonstrating that IC can meet privacy budgets under composition and can leverage copula perturbations to manage dependencies. The results offer a principled pathway to privacy accounting in complex, interdependent data-processing pipelines and highlight remaining challenges in algorithmic implementation and scalability.
Abstract
Differential privacy (DP) has become the gold standard for privacy-preserving data analysis, but its applicability can be limited in scenarios involving complex dependencies between sensitive information and datasets. To address this, we introduce \textit{differential confounding privacy} (DCP), a specialized form of the Pufferfish privacy (PP) framework that generalizes DP by accounting for broader relationships between sensitive information and datasets. DCP adopts the $(\epsilon, \delta)$-indistinguishability framework to quantify privacy loss. We show that while DCP mechanisms retain privacy guarantees under composition, they lack the graceful compositional properties of DP. To overcome this, we propose an \textit{Inverse Composition (IC)} framework, in which a leader-follower model optimally designs a privacy strategy to achieve target guarantees without relying on worst-case privacy proofs, such as sensitivity calculations. Experimental results validate IC's effectiveness in managing privacy budgets and ensuring rigorous privacy guarantees under composition.
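To make the $(\epsilon, \delta)$-indistinguishability notion concrete, the following is a minimal sketch for a toy discrete setting. It is not the paper's algorithm. It assumes a binary secret $S$, a binary dataset value $X$ that depends on $S$ through a made-up conditional distribution, and a randomized-response mechanism with output $Y$. For a given $\epsilon$, it computes the smallest $\delta$ such that $\Pr[Y \in A \mid S = s] \le e^{\epsilon} \Pr[Y \in A \mid S = s'] + \delta$ holds for every output set $A$ and every ordered pair of secrets. All function names and numbers are hypothetical.

```python
import math


def output_dist(p_x_given_s, p_y_given_x):
    """Return P(Y=y | S=s) = sum_x P(X=x | S=s) * P(Y=y | X=x) as a list over y."""
    n_y = len(p_y_given_x[0])
    return [
        sum(px * p_y_given_x[x][y] for x, px in enumerate(p_x_given_s))
        for y in range(n_y)
    ]


def delta_for_epsilon(p, q, eps):
    """Smallest delta with P(A) <= e^eps * Q(A) + delta for all sets A.

    For discrete distributions this is the hockey-stick divergence
    sum_y max(0, p(y) - e^eps * q(y)).
    """
    return sum(max(0.0, pi - math.exp(eps) * qi) for pi, qi in zip(p, q))


def indistinguishability_delta(p_x_given_secret, p_y_given_x, eps):
    """Worst-case delta over all ordered pairs of distinct secrets."""
    outs = [output_dist(px, p_y_given_x) for px in p_x_given_secret]
    return max(
        delta_for_epsilon(outs[i], outs[j], eps)
        for i in range(len(outs))
        for j in range(len(outs))
        if i != j
    )


# Hypothetical toy numbers (assumptions, not taken from the paper).
# Row s gives P(X = . | S = s): the secret shifts the data distribution.
p_x_given_secret = [[0.9, 0.1], [0.2, 0.8]]
# Row x gives P(Y = . | X = x): randomized response that keeps x w.p. 0.75.
p_y_given_x = [[0.75, 0.25], [0.25, 0.75]]

delta_at_zero = indistinguishability_delta(p_x_given_secret, p_y_given_x, 0.0)
delta_at_ln2 = indistinguishability_delta(p_x_given_secret, p_y_given_x, math.log(2))
```

With these toy numbers, `delta_at_zero` is about 0.35 (the total variation distance between the two conditional output distributions) and `delta_at_ln2` is about 0.05. The point of the sketch is that the privacy loss is measured between output distributions conditioned on the secret, so it depends on how $X$ depends on $S$ as well as on the mechanism's noise.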
