The Inadequacy of Similarity-based Privacy Metrics: Privacy Attacks against "Truly Anonymous" Synthetic Datasets

Georgi Ganev, Emiliano De Cristofaro

TL;DR

The paper examines the privacy guarantees of commercial and research systems that release synthetic tabular data and rely on similarity-based privacy metrics (SBPMs): Identical Match Share (IMS), Distance to Closest Record (DCR), and Nearest Neighbor Distance Ratio (NNDR), together with two filters, the Similarity Filter (SF) and the Outlier Filter (OF). It shows that these SBPMs lack theoretical guarantees, treat privacy as a binary pass/fail property, follow a non-contrastive process, and can be bypassed through targeted attacks. The authors introduce DifferenceAttack, a low-cost membership and attribute inference attack, and ReconSyn, a black-box reconstruction attack that recovers 78-100% of the outliers in the train data across models and datasets by exploiting leakage from the metrics rather than model memorization. Experiments across multiple DP and non-DP models and datasets show that applying DP only to the model does not prevent the attack, because exposing the privacy metrics breaks the end-to-end DP pipeline. This highlights the need for formal DP guarantees, for example DP versions of the metrics or alternative privacy mechanisms. The work urges practitioners and policymakers to move beyond ad-hoc SBPMs toward rigorous privacy pipelines to prevent real-world leakage in synthetic data deployments.

Abstract

Generative models producing synthetic data are meant to provide a privacy-friendly approach to releasing data. However, their privacy guarantees are only considered robust when models satisfy Differential Privacy (DP). Alas, this is not a ubiquitous standard, as many leading companies (and, in fact, research papers) use ad-hoc privacy metrics based on testing the statistical similarity between synthetic and real data. In this paper, we examine the privacy metrics used in real-world synthetic data deployments and demonstrate their unreliability in several ways. First, we provide counter-examples where severe privacy violations occur even if the privacy tests pass and instantiate accurate membership and attribute inference attacks with minimal cost. We then introduce ReconSyn, a reconstruction attack that generates multiple synthetic datasets that are considered private by the metrics but actually leak information unique to individual records. We show that ReconSyn recovers 78-100% of the outliers in the train data with only black-box access to a single fitted generative model and the privacy metrics. In the process, we show that applying DP only to the model does not mitigate this attack, as using privacy metrics breaks the end-to-end DP pipeline.
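
To make the idea of a similarity-based privacy test concrete, the following is a minimal sketch of two of the metrics named in the TL;DR (DCR and IMS) and of a pass/fail check built on them. It is an illustration under stated assumptions, not the paper's or any vendor's implementation: the function names, the Euclidean distance, the 5th-percentile comparison rule, and the toy data are all hypothetical. The probe at the end is also not the paper's DifferenceAttack; it only shows how a single pass/fail bit can reveal whether a record is in the train data.

```python
import numpy as np

def dcr(queries, reference):
    """Distance to Closest Record: for each query row, the Euclidean
    distance to its nearest row in `reference` (assumed distance choice)."""
    diffs = queries[:, None, :] - reference[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

def ims(queries, reference):
    """Identical Match Share: fraction of query rows that exactly
    match some row in `reference`."""
    return float(np.mean(dcr(queries, reference) == 0.0))

def passes_dcr_test(synth, train, test, q=5):
    """Hypothetical pass rule: at the q-th percentile, synthetic records
    must be no closer to train than held-out test records are."""
    synth_to_train = np.percentile(dcr(synth, train), q)
    test_to_train = np.percentile(dcr(test, train), q)
    return bool(synth_to_train >= test_to_train)

# Toy data (made up for illustration): the last train row is an outlier.
train = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
test = np.array([[0.5, 0.5], [2.0, 2.0]])

# An adversary submits one-record "datasets" and watches the outcome.
candidate_in = np.array([[5.0, 5.0]])   # is a train record
candidate_out = np.array([[4.0, 4.0]])  # is not a train record

for name, cand in [("in", candidate_in), ("out", candidate_out)]:
    print(name, "IMS:", ims(cand, train),
          "passes DCR test:", passes_dcr_test(cand, train, test))
```

In this toy setting the candidate that coincides with a train record fails the test and the other passes, so the outcome of the test is itself a function of the private train data. That dependence is the kind of metric leakage the abstract refers to.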

Paper Structure (34 sections, 1 equation, 12 figures, 6 tables, 2 algorithms)

Figures (12)

  • Figure 1: Overview of the success rates of ReconSyn and DifferenceAttack. ReconSyn reconstructs outliers from the train data, with success varying by attack phase (SampleAttack and SearchAttack) and by the number of calls. DifferenceAttack achieves 100% success on membership and attribute inference ($k$ denotes the number of possible categories for the unknown attribute).
  • Figure 2: Train and test data, 2d Gauss.
  • Figure 3: Synth data reproducing all train outliers, 2d Gauss.
  • Figure 4: Applying SF resulting in "Swiss cheese," 2d Gauss.
  • Figure 5: ReconSyn Overview. The provider 1. splits the private data into train/test, 2. fits the generative model on train data, 3. generates synthetic data (privacy filters are applied), 4. runs the privacy metrics on synthetic data. The adversary can make API calls (i.e., black-box access) to the fitted generative model and privacy metrics. They a. generate synthetic datasets, b. run them through the privacy metrics to observe the pass/fail tests and scores (if tests pass), c. reconstruct train data outliers (through SampleAttack and SearchAttack).
  • ...and 7 more figures