A Refreshment Stirred, Not Shaken (II): Invariant-Preserving Deployments of Differential Privacy for the US Decennial Census
James Bailie, Ruobin Gong, Xiao-Li Meng
TL;DR
The paper develops an invariant-aware framework for analyzing differential privacy (DP) deployments for census data and applies it to two statistical disclosure control (SDC) methods: the Permutation Swapping Algorithm (PSA), a traditional swapping method, and the TopDown Algorithm (TDA). It builds on a unified system of DP specifications, composed of building blocks and multiverse invariants, that allows rigorous DP guarantees to be stated for invariant-preserving mechanisms. The PSA is shown to satisfy ε-DP subject to its invariants, with a privacy budget that depends on the swap rate and stratum size, while the TDA is shown to satisfy a zCDP-based DP specification subject to its invariants, illustrating that invariants critically shape actual privacy protection. Through numerical demonstrations (e.g., the 1940 Census) and a counterfactual comparison with the 2020 DAS, the work clarifies how invariants influence both privacy guarantees and data utility, and highlights the need for careful interpretation when translating theoretical DP guarantees into practical privacy protection. Overall, the paper provides a principled, building-block-based lens for comparing traditional SDC methods with modern DP deployments and emphasizes the nuanced role of invariants in determining actual privacy protection.
Abstract
Through the lens of the system of differential privacy specifications developed in Part I of a trio of articles, this second paper examines two statistical disclosure control (SDC) methods for the United States Decennial Census: the Permutation Swapping Algorithm (PSA), which is similar to the 2010 Census's disclosure avoidance system (DAS), and the TopDown Algorithm (TDA), which was used in the 2020 DAS. To varying degrees, both methods leave unaltered some statistics of the confidential data, which are called the method's invariants, and hence neither can be readily reconciled with differential privacy (DP), at least as it was originally conceived. Nevertheless, we establish that the PSA satisfies $\varepsilon$-DP subject to the invariants it necessarily induces, thereby showing that this traditional SDC method can in fact still be understood within our more-general system of DP specifications. By a similar modification to $\rho$-zero concentrated DP, we also provide a DP specification for the TDA. Finally, as a point of comparison, we consider the counterfactual scenario in which the PSA was adopted for the 2020 Census, resulting in a reduction in the nominal privacy loss, but at the cost of releasing many more invariants. Therefore, while our results explicate the mathematical guarantees of SDC provided by the PSA, the TDA and the 2020 DAS in general, care must be taken in their translation to actual privacy protection, just as is the case for any DP deployment.
