Metric geometry of the privacy-utility tradeoff
March Boedihardjo, Thomas Strohmer, Roman Vershynin
TL;DR
This work develops a geometric framework for the privacy-utility tradeoff under metric privacy. Its central quantity is the entropic scale $s(Z,\alpha)$, which encodes the multiscale packing geometry of the space $Z$. The authors relate the optimal accuracy $A(Z,\alpha)$ to $s(Z,\alpha)$, introduce a diametric generalization $s_\circ(Z,\alpha)$ to handle disconnected spaces, and derive sharp results in the ultrametric setting. Two main results are stated: when $\rho_1=\rho_2$, $A(Z,\alpha)$ is bounded above and below by values of $s(Z,\cdot)$ up to constants (with $A(Z,\alpha) \asymp s(Z,\alpha)$ for norm-convex spaces), while in the ultrametric case the bound involves the sum $s(Z,2\alpha)+s_\circ(Z,2\alpha)$. The paper also provides universal lower bounds and concrete examples (unit balls, Wasserstein spaces, Lipschitz function spaces, and Boolean cubes) that illustrate how the entropic and diametric scales characterize the privacy-utility landscape in non-discrete settings.
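Since the entropic scale is described in terms of packing geometry across scales, a small numerical sketch of packing counts may help fix ideas. The code below does not compute $s(Z,\alpha)$, whose precise definition is not reproduced here. It only estimates, for a finite point set, how many points can be chosen pairwise more than $\varepsilon$ apart, at several values of $\varepsilon$. The function names, the greedy strategy, and the example grid are illustrative assumptions, and conventions for packing numbers (strict versus non-strict separation) vary between texts.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")


def greedy_separated_subset(
    points: Sequence[T], dist: Callable[[T, T], float], eps: float
) -> List[T]:
    """Return a maximal subset whose points are pairwise more than eps apart.

    The greedy choice gives a valid eps-separated set, so its size is a
    lower bound on the packing number at scale eps (not necessarily equal).
    """
    chosen: List[T] = []
    for p in points:
        # Keep p only if it is far from every point selected so far.
        if all(dist(p, q) > eps for q in chosen):
            chosen.append(p)
    return chosen


def packing_profile(
    points: Sequence[T], dist: Callable[[T, T], float], scales: Sequence[float]
) -> Dict[float, int]:
    """Map each scale eps to the size of a greedy eps-separated subset."""
    return {eps: len(greedy_separated_subset(points, dist, eps)) for eps in scales}


# Hypothetical example: the integers 0..10 with the usual distance on the line.
grid = list(range(11))
profile = packing_profile(grid, lambda a, b: abs(a - b), [0.5, 2.5, 5, 10])
print(profile)
```

The count drops as the scale grows. This decay of packing numbers with scale is the kind of multiscale information the entropic scale is said to capture.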
Abstract
Synthetic data are an attractive concept to enable privacy in data sharing. A fundamental question is how closely the privacy-preserving synthetic data resemble the true data. Using metric privacy, an effective generalization of differential privacy beyond the discrete setting, we raise the problem of characterizing the optimal privacy-accuracy tradeoff by the metric geometry of the underlying space. We provide a partial solution to this problem in terms of the "entropic scale", a quantity that captures the multiscale geometry of a metric space via the behavior of its packing numbers. We illustrate the applicability of our privacy-accuracy tradeoff framework via a diverse set of examples of metric spaces.
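As a concrete illustration of the tradeoff, recall the standard formulation of metric privacy: a randomized mechanism $M$ on a metric space with metric $\rho$ is $\alpha$-metrically private if $\mathbb{P}\{M(x)\in S\} \le e^{\alpha\rho(x,y)}\,\mathbb{P}\{M(y)\in S\}$ for all inputs $x,y$ and measurable sets $S$. The sketch below checks this numerically for one simple mechanism on the real line, $M(x) = x + \text{Laplace noise of scale } 1/\alpha$. This mechanism is an assumption chosen for illustration, not a construction taken from the paper, and it is not claimed to be optimal. All function names are hypothetical.

```python
import math


def laplace_density(t: float, center: float, alpha: float) -> float:
    """Density at t of center + Laplace noise with scale 1/alpha."""
    return 0.5 * alpha * math.exp(-alpha * abs(t - center))


def max_log_density_ratio(x: float, y: float, alpha: float, grid) -> float:
    """Largest log-ratio of the output densities for inputs x and y over a grid."""
    return max(
        math.log(laplace_density(t, x, alpha) / laplace_density(t, y, alpha))
        for t in grid
    )


def expected_abs_error(alpha: float) -> float:
    """E|M(x) - x| for Laplace noise with scale 1/alpha, which equals 1/alpha."""
    return 1.0 / alpha


# Assumed example values: privacy parameter and two nearby inputs.
alpha, x, y = 2.0, 0.3, 1.1
grid = [i / 100 for i in range(-500, 501)]  # evaluation points in [-5, 5]

worst = max_log_density_ratio(x, y, alpha, grid)
# The privacy condition requires worst <= alpha * |x - y|.
print(worst, alpha * abs(x - y), expected_abs_error(alpha))
```

For this mechanism the density ratio bound follows from the triangle inequality, and the expected error $1/\alpha$ grows as the privacy parameter $\alpha$ shrinks. This is the simplest instance of the privacy-accuracy tradeoff that the paper studies for general metric spaces.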
