Beyond Membership: Limitations of Add/Remove Adjacency in Differential Privacy
Gauri Pradhan, Joonas Jälkö, Santiago Zanella-Bèguelin, Antti Honkela
TL;DR
This work argues that the conventional add/remove adjacency used in differential privacy can overstate protection for per-record attributes, motivating substitute adjacency as a more accurate privacy notion for attribute privacy. It introduces canary-based auditing tools to empirically assess DP under substitute adjacency and shows that leakage observed in real models often aligns with substitute-DP budgets rather than add/remove bounds. Through gradient-space and input-space canaries across natural and synthetic datasets, the authors demonstrate that attribute or label privacy can be violated beyond add/remove guarantees, especially under high subsampling. The findings have practical implications for how privacy guarantees are reported and interpreted in DP-enabled ML pipelines, and point to the need for auditing methods that explicitly account for substitute adjacency and attribute leakage.
Abstract
Training machine learning models with differential privacy (DP) limits an adversary's ability to infer sensitive information about the training data. It can be interpreted as a bound on an adversary's capability to distinguish two adjacent datasets according to a chosen adjacency relation. In practice, most DP implementations use the add/remove adjacency relation, where two datasets are adjacent if one can be obtained from the other by adding or removing a single record, thereby protecting membership. In many ML applications, however, the goal is to protect attributes of individual records (e.g., labels used in supervised fine-tuning). We show that privacy accounting under add/remove overstates attribute privacy compared to accounting under the substitute adjacency relation, which permits substituting one record. To demonstrate this gap, we develop novel attacks to audit DP under substitute adjacency, and show empirically that audit results are inconsistent with DP guarantees reported under add/remove, yet remain consistent with the budget accounted under the substitute adjacency relation. Our results highlight that the choice of adjacency when reporting DP guarantees is critical when the protection target is per-record attributes rather than membership.
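To make the two adjacency relations concrete, the following is a minimal Python sketch, not the authors' code. It treats a dataset as a multiset of (features, label) records; the helper names (`multiset_diff`, `is_add_remove_adjacent`, `is_substitute_adjacent`) and the toy records are assumptions introduced here purely for illustration.

```python
from collections import Counter

def multiset_diff(d1, d2):
    """Return the records found only in d1 and only in d2 (as multisets)."""
    c1, c2 = Counter(d1), Counter(d2)
    # Counter subtraction keeps only positive counts.
    return c1 - c2, c2 - c1

def is_add_remove_adjacent(d1, d2):
    """Adjacent if one dataset equals the other plus or minus one record."""
    only1, only2 = multiset_diff(d1, d2)
    return sum(only1.values()) + sum(only2.values()) == 1

def is_substitute_adjacent(d1, d2):
    """Adjacent if exactly one record of d1 is replaced by another in d2."""
    only1, only2 = multiset_diff(d1, d2)
    return sum(only1.values()) == 1 and sum(only2.values()) == 1

# Hypothetical toy records: (features, label).
base = [("x1", 0), ("x2", 1), ("x3", 0)]
removed = [("x1", 0), ("x2", 1)]               # one record dropped
relabeled = [("x1", 0), ("x2", 1), ("x3", 1)]  # label of x3 changed

print(is_add_remove_adjacent(base, removed))    # membership change
print(is_substitute_adjacent(base, relabeled))  # attribute (label) change
print(is_add_remove_adjacent(base, relabeled))  # not one add/remove step
```

In this sketch, changing a single label yields a pair of datasets that is adjacent under substitution but not under add/remove: reaching one from the other takes a removal followed by an addition. This is the intuition behind why a budget reported under add/remove does not directly bound attribute or label inference.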
