Efficient and Private Marginal Reconstruction with Local Non-Negativity

Brett Mullins, Miguel Fuentes, Yingtai Xiao, Daniel Kifer, Cameron Musco, Daniel Sheldon

TL;DR

The paper addresses private reconstruction of marginal queries under differential privacy by introducing ReM, a residuals-to-marginals post-processing method that reconstructs workload marginals from residual measurements using a Kronecker-structured representation. It extends to Gaussian noise with GReM-LNN, which enforces local non-negativity to reduce reconstruction error, and proves efficiency through residual decomposition and pseudoinverse mappings. Empirical results show substantial accuracy gains and scalability improvements when integrating ReM and GReM-LNN into private query mechanisms such as ResidualPlanner and Scalable MWEM. The work provides practical algorithms and complexity analyses, enabling accurate, scalable private marginal reconstruction without exponential blow-up in high-dimensional domains.

Abstract

Differential privacy is the dominant standard for formal and quantifiable privacy and has been used in major deployments that impact millions of people. Many differentially private algorithms for query release and synthetic data contain steps that reconstruct answers to queries from answers to other queries that have been measured privately. Reconstruction is an important subproblem for such mechanisms to economize the privacy budget, minimize error on reconstructed answers, and allow for scalability to high-dimensional datasets. In this paper, we introduce a principled and efficient postprocessing method ReM (Residuals-to-Marginals) for reconstructing answers to marginal queries. Our method builds on recent work on efficient mechanisms for marginal query release, based on making measurements using a residual query basis that admits efficient pseudoinversion, which is an important primitive used in reconstruction. An extension GReM-LNN (Gaussian Residuals-to-Marginals with Local Non-negativity) reconstructs marginals under Gaussian noise satisfying consistency and non-negativity, which often reduces error on reconstructed answers. We demonstrate the utility of ReM and GReM-LNN by applying them to improve existing private query answering mechanisms.
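The reconstruction idea in the abstract can be illustrated on a toy domain. The sketch below is not the authors' implementation: it assumes a "subtraction" matrix as the per-attribute residual factor (the convention used in prior residual-basis work), materializes every Kronecker product explicitly, and uses a dense pseudoinverse, all of which a scalable method would avoid. The helper names (`sub_matrix`, `marginal_matrix`, `residual_matrix`, `subsets`) are hypothetical.

```python
import itertools
import numpy as np

def sub_matrix(n):
    """(n-1) x n matrix whose row j is e_0 - e_{j+1} (assumed residual factor)."""
    return np.hstack([np.ones((n - 1, 1)), -np.eye(n - 1)])

def kron_all(factors):
    """Kronecker product of a list of matrices, left to right."""
    out = np.ones((1, 1))
    for f in factors:
        out = np.kron(out, f)
    return out

def marginal_matrix(tau, sizes):
    """Identity on attributes in tau, all-ones row (sum out) elsewhere."""
    return kron_all([np.eye(n) if i in tau else np.ones((1, n))
                     for i, n in enumerate(sizes)])

def residual_matrix(tau, sizes):
    """Subtraction matrix on attributes in tau, all-ones row elsewhere."""
    return kron_all([sub_matrix(n) if i in tau else np.ones((1, n))
                     for i, n in enumerate(sizes)])

def subsets(tau):
    """All subsets of tau, including the empty set."""
    items = sorted(tau)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in itertools.combinations(items, r)]

sizes = (2, 3, 2)                      # toy domain: three attributes
tau = frozenset({0, 1})                # marginal to reconstruct
rng = np.random.default_rng(0)
p = rng.integers(0, 20, size=int(np.prod(sizes))).astype(float)  # data vector

# Stack the residual queries for every subset of tau.
R = np.vstack([residual_matrix(g, sizes) for g in subsets(tau)])
M = marginal_matrix(tau, sizes)

# Linear map from residual answers to the tau-marginal.
A = M @ np.linalg.pinv(R)
reconstructed = A @ (R @ p)            # exact residual answers in, marginal out
```

With exact residual answers the marginal is recovered exactly, because the rows of the marginal matrix lie in the span of the residual rows for subsets of `tau`. Replacing `R @ p` with noisy measurements would give a least-squares style estimate under the same assumptions.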

Paper Structure (29 sections, 26 theorems, 36 equations, 11 figures, 2 tables, 7 algorithms)

Key Result

Proposition 1

Let $R_{\mathcal{S}} = (R_\tau)_{\tau \in \mathcal{S}}$ be a combined workload of residual queries for all $\tau$ in a collection $\mathcal{S} \subseteq 2^{[d]}$, where the individual matrices $R_\tau$ are stacked vertically. The size of $R_\mathcal{S}$ is $m \times n$ where $m = \sum_{\tau \in \mathcal{S}} \ldots$ […]

The matrix $A_{\gamma,\tau}$ has size $n_\gamma \times m_\tau$ and maps from the space of $\tau$-re… […]
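The dimensions in the proposition can be checked numerically on a small example. This is a sketch under assumptions not stated in the excerpt: each residual factor is taken to have $n_i - 1$ rows, so that $m_\tau = \prod_{i \in \tau} (n_i - 1)$, and $n_\gamma = \prod_{i \in \gamma} n_i$ is the size of the $\gamma$-marginal. The collection and helper names are hypothetical.

```python
import math
import numpy as np

def sub_matrix(n):
    """(n-1) x n matrix whose row j is e_0 - e_{j+1} (assumed residual factor)."""
    return np.hstack([np.ones((n - 1, 1)), -np.eye(n - 1)])

def residual_matrix(tau, sizes):
    """Kronecker product: subtraction matrix on tau, all-ones row elsewhere."""
    out = np.ones((1, 1))
    for i, n in enumerate(sizes):
        out = np.kron(out, sub_matrix(n) if i in tau else np.ones((1, n)))
    return out

def residual_rows(tau, sizes):
    """m_tau under the assumption above (empty product is 1)."""
    return math.prod(sizes[i] - 1 for i in tau)

def marginal_rows(gamma, sizes):
    """n_gamma: number of cells in the gamma-marginal."""
    return math.prod(sizes[i] for i in gamma)

sizes = (2, 3, 4)                                   # toy attribute domain sizes
collection = [(), (0,), (1,), (0, 1), (1, 2)]       # hypothetical collection S

m = sum(residual_rows(t, sizes) for t in collection)
n = math.prod(sizes)

# Stack the individual residual matrices vertically, as in the proposition.
R_S = np.vstack([residual_matrix(t, sizes) for t in collection])

# Shape of a map from (1,)-residual answers to the (0, 1)-marginal.
a_shape = (marginal_rows((0, 1), sizes), residual_rows((1,), sizes))
```

Only the row counts are needed to size these matrices, so they can be computed without ever forming the Kronecker products; the explicit stack here is just a consistency check.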

Figures (11)

  • Figure 1: Kronecker structure of workloads.
  • Figure 2: Average $\ell_1$ workload error on all 3-way marginals across five trials and privacy budgets $\epsilon \in \{ 0.1, 0.31, 1, 3.16, 10 \}$ and $\delta = 1 \times 10^{-9}$ for ResidualPlanner.
  • Figure 3: Average $\ell_1$ workload error on all 3-way marginals across five trials and privacy budgets $\epsilon \in \{ 0.1, 0.31, 1, 3.16, 10 \}$ and $\delta = 1 \times 10^{-9}$ for Scalable MWEM with 30 rounds of measurements.
  • Figure 4: Average $\ell_2$ workload error on all 3-way marginals across five trials and privacy budgets $\epsilon \in \{ 0.1, 0.31, 1, 3.16, 10 \}$ and $\delta = 1 \times 10^{-9}$ for ResidualPlanner.
  • Figure 5: Average $\ell_2$ workload error on all 3-way marginals across five trials and privacy budgets $\epsilon \in \{ 0.1, 0.31, 1, 3.16, 10 \}$ and $\delta = 1 \times 10^{-9}$ for Scalable MWEM with 30 rounds of measurements.
  • ...and 6 more figures
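The figure captions report average $\ell_1$ and $\ell_2$ workload error over a grid of privacy budgets. The sketch below shows one plausible way to compute such a metric; the exact definition and any normalization used in the paper are not given in this excerpt, so the function `average_workload_error` and the toy arrays are assumptions. The listed $\epsilon$ values look log-spaced in half-decade steps, which `numpy.logspace` reproduces approximately.

```python
import numpy as np

def average_workload_error(true_marginals, est_marginals, ord=1):
    """Mean over marginals of the l_ord norm of (true - estimate).

    Assumed definition; the paper may normalize differently.
    """
    errors = [np.linalg.norm(np.ravel(t) - np.ravel(e), ord=ord)
              for t, e in zip(true_marginals, est_marginals)]
    return float(np.mean(errors))

# Toy workload of two marginals (made-up numbers, for illustration only).
true_marginals = [np.array([3.0, 1.0]), np.array([2.0, 2.0])]
est_marginals = [np.array([2.0, 1.0]), np.array([2.0, 4.0])]

l1_error = average_workload_error(true_marginals, est_marginals, ord=1)
l2_error = average_workload_error(true_marginals, est_marginals, ord=2)

# Five log-spaced budgets between 10^-1 and 10^1.
epsilons = np.logspace(-1, 1, 5)
```

In an experiment loop, this function would be called once per trial and per value in `epsilons`, and the results averaged across trials.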

Theorems & Definitions (39)

  • Proposition 1
  • Definition 1
  • Proposition 2: Post-processing (Dwork and Roth, 2014)
  • Theorem 1
  • Lemma 1
  • Theorem 2
  • Theorem 3: Efficient pseudoinversion of marginal query matrix
  • Theorem 4
  • Proposition 3
  • Definition 2
  • ...and 29 more