Data Reconstruction: When You See It and When You Don't

Edith Cohen; Haim Kaplan; Yishay Mansour; Shay Moran; Kobbi Nissim; Uri Stemmer; Eliad Tsfadia

TL;DR

This work tackles the problem of defining reconstruction attacks. It argues that no single universal definition suffices and instead proposes a "sandwich" strategy built around two complementary questions: when is a system provably protected against such attacks, and when does an attack clearly show that it is not? It introduces Narcissus resiliency, a self-referential security paradigm that compares an attacker's success on the real data to the same attacker's success when blind to that data, thereby avoiding fixed baselines and aligning with cryptographic and privacy notions such as differential privacy (DP) and membership-inference (MI) security. The paper also formalizes extraction from outputs via Kolmogorov complexity, shows that Narcissus resiliency prevents non-trivial extraction, relates the framework to well-known concepts such as one-way functions and encryption, and discusses practical verification and limitations. Through expressiveness results, it shows that MI security and predicate singling out are specific instantiations of Narcissus resiliency, and it surveys related work on memorization, DP, and legal privacy concepts. Overall, Narcissus resiliency offers a unified, adaptable lens for reasoning about reconstruction attacks and the protection they require, with implications for designing and auditing privacy-preserving mechanisms.
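The comparison at the heart of Narcissus resiliency (an attacker's success on the real data versus its success when blind to it) can be illustrated with a small Monte Carlo experiment. The following is a minimal sketch under stated assumptions, not the paper's formal definition: the 8-bit toy dataset, the two mechanisms, the copying attacker, and the exact-match relation are all hypothetical names chosen for illustration.

```python
import random

N_BITS = 8  # toy dataset: a tuple of 8 random bits (assumption for illustration)

def sample_dataset(rng):
    return tuple(rng.randint(0, 1) for _ in range(N_BITS))

def leaky_mechanism(dataset, rng):
    # Hypothetical worst case: publishes the dataset verbatim.
    return dataset

def constant_mechanism(dataset, rng):
    # Hypothetical best case: the output is independent of the dataset.
    return None

def attacker(output, rng):
    # Copies whatever it sees; with no information, guesses all zeros.
    return output if output is not None else (0,) * N_BITS

def exact_match(dataset, guess):
    # Toy reconstruction relation: 1 only on an exact match.
    return int(dataset == guess)

def success_rates(mechanism, trials=20000, seed=0):
    """Estimate the attacker's success on real data and when blind to it."""
    rng = random.Random(seed)
    real = blind = 0
    for _ in range(trials):
        s = sample_dataset(rng)
        t = sample_dataset(rng)  # independent dataset used in the blind run
        # Real run: the attacker sees an output computed from s.
        real += exact_match(s, attacker(mechanism(s, rng), rng))
        # Blind run: the attacker sees an output computed from t,
        # but is still scored against s.
        blind += exact_match(s, attacker(mechanism(t, rng), rng))
    return real / trials, blind / trials

if __name__ == "__main__":
    for name, mech in [("leaky", leaky_mechanism), ("constant", constant_mechanism)]:
        real, blind = success_rates(mech)
        print(f"{name}: real={real:.4f} blind={blind:.4f}")
```

In this toy setting the leaky mechanism shows a large gap between the two rates, while the constant mechanism shows none. The paper's definition quantifies over all attackers and carries parameters (such as the $(0,\delta,\{{\cal D}^n\})$ appearing in Theorem 2.1) that this sketch does not model.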

Abstract

We revisit the fundamental question of formally defining what constitutes a reconstruction attack. While often clear from the context, our exploration reveals that a precise definition is much more nuanced than it appears, to the extent that a single all-encompassing definition may not exist. Thus, we employ a different strategy and aim to "sandwich" the concept of reconstruction attacks by addressing two complementing questions: (i) What conditions guarantee that a given system is protected against such attacks? (ii) Under what circumstances does a given attack clearly indicate that a system is not protected? More specifically,

* We introduce a new definitional paradigm -- Narcissus Resiliency -- to formulate a security definition for protection against reconstruction attacks. This paradigm has a self-referential nature that enables it to circumvent shortcomings of previously studied notions of security. Furthermore, as a side-effect, we demonstrate that Narcissus resiliency captures as special cases multiple well-studied concepts including differential privacy and other security notions of one-way functions and encryption schemes.
* We formulate a link between reconstruction attacks and Kolmogorov complexity. This allows us to put forward a criterion for evaluating when such attacks are convincingly successful.

Paper Structure (39 sections, 9 theorems, 62 equations, 2 algorithms)

Introduction
Protecting against reconstruction
Towards a new definition
A new Definitional paradigm: Narcissus resiliency -- an adversary trying to beat itself in its own game
Identifying reconstruction
Defining Extraction via Kolmogorov Complexity
Narcissus-Resiliency prevents non-trivial extraction of training data
Verifying the validity of reconstruction attacks
Additional related works
Memorization
Computational Differential Privacy
Membership Inference
Reconstruction
Formalizing legal concepts of privacy
Expressiveness of Narcissus resiliency
...and 24 more sections

Key Result

Theorem 2.1

Let ${\cal M}:{\cal X}^n\rightarrow Y$ be an algorithm and let ${\cal D}$ be a distribution over ${\cal X}$. Then ${\cal M}$ is $(\delta,{\cal D})$-MI-secure if and only if it is $(0,\delta,\{{\cal D}^n\})$-$R_{\rm MI}$-Narcissus-resilient.

Theorems & Definitions (56)

Definition 1: balle2022reconstructing, cummings2024attaxonomy
Definition 2
Definition 3: Narcissus resiliency
Example 1
Example 2
Definition 4: CarliniLargeModels21, CarliniDiffusion23
Definition 5: $K_{{\cal L}}$-Complexity
Definition 6: Our extraction definition, informal
Definition 7: Resilience to Membership Inference, shokri2017membership, yeom2018privacy
Definition 8
...and 46 more
