Random (Un)rounding : Vulnerabilities in Discrete Attribute Disclosure in the 2021 Canadian Census

Christopher West, Vecna, Raiyan Chowdhury

TL;DR

This work exposes vulnerabilities in the random-rounding privacy mechanism of the 2021 Canadian census, which rounds published values to multiples of five. It shows that correlations among discrete attributes, together with published population invariants, can enable both exact and probabilistic disclosure. The authors develop invariant-based and invariant-free inference methods that use SAT/SMT solvers to reconstruct true values and to assign likelihoods to feasible solutions. They report 624 exact disclosures (via invariants) and more than 1000 further values inferred with a high probability of correctness. As a mitigation, they propose unbounded discrete noise drawn from a discrete Laplace distribution with $t=1.45$, which preserves utility while obstructing exact unrounding and SAT-enumeration attacks. The results have practical implications for census privacy design.

Abstract

The 2021 Canadian census is notable for using a unique form of privacy protection, random rounding, which independently and probabilistically rounds discrete numerical attribute values. In this work, we explore how hierarchical summative correlation between discrete variables allows for both probabilistic and exact solutions to attribute values in the 2021 Canadian census disclosure. We demonstrate that, in some cases, it is possible to "unround" and extract the original private values before rounding, both in the presence and absence of provided population invariants. Using these methods, we expose the exact value of 624 previously private attributes in the 2021 Canadian census disclosure. We also infer the potential values of more than 1000 private attributes with a high probability of correctness. Finally, we show how a simple solution based on unbounded discrete noise can effectively negate exact unrounding while maintaining high utility in the final product.
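
To make the idea of "unrounding" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes an unbiased rounding rule (a value is moved to the nearest lower or upper multiple of five with probability proportional to proximity, and exact multiples are left unchanged), assumes counts are non-negative, and replaces the SAT/SMT solvers used in the paper with plain brute-force enumeration. The function names `random_round`, `candidates`, and `unround` are hypothetical.

```python
import itertools
import random

BASE = 5  # rounding base: published values are multiples of five


def random_round(value: int, rng: random.Random) -> int:
    """Randomly round `value` to an adjacent multiple of BASE.

    Assumption: the rule is unbiased, so the probability of rounding up
    is remainder / BASE, and exact multiples are published unchanged.
    """
    remainder = value % BASE
    if remainder == 0:
        return value
    lower = value - remainder
    return lower + BASE if rng.random() < remainder / BASE else lower


def candidates(published: int) -> list[int]:
    """Non-negative true values that could have produced `published`."""
    if published % BASE != 0:
        raise ValueError("published values must be multiples of BASE")
    low = max(0, published - BASE + 1)
    return list(range(low, published + BASE))


def unround(published_children: list[int], exact_total: int) -> list[tuple[int, ...]]:
    """Enumerate child values consistent with an exact (invariant) total.

    Brute force stands in for a SAT/SMT solver here; it is only practical
    for a handful of attributes.
    """
    options = [candidates(p) for p in published_children]
    return [
        combo
        for combo in itertools.product(*options)
        if sum(combo) == exact_total
    ]


if __name__ == "__main__":
    # Three children each published as 5 could each be anything in 1..9,
    # but an exact total of 27 leaves a single feasible assignment.
    print(unround([5, 5, 5], 27))
    # With a total of 10 and two children, several assignments remain,
    # so only a probabilistic statement is possible.
    print(len(unround([5, 5], 10)))
```

When the enumeration returns exactly one assignment, every child value is disclosed exactly; when several remain, the relative likelihood of each assignment under the rounding rule is what supports a probabilistic inference.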

Paper Structure (23 sections, 1 equation, 4 figures, 6 tables)

Figures (4)

  • Figure 1: Invariant-based Inference, with 1-2 Levels of Potential Compound Inferences Shown
  • Figure 2: Feature Probability Histograms
  • Figure 3: Signed Distance PDFs
  • Figure :

Theorems & Definitions (1)

  • definition 1