Training Data Reconstruction: Privacy due to Uncertainty?

Christina Runkel, Kanchana Vaishnavi Gandikota, Jonas Geiping, Carola-Bibiane Schönlieb, Michael Moeller

TL;DR

This work studies the privacy risk of reconstructing training data from neural network parameters by casting the task as a bilevel optimisation problem: minimise $l(\theta^*, \theta(x,y))$ subject to $\theta(x,y)$ solving the lower-level training problem. It demonstrates that reconstruction outcomes depend critically on how the input $x$ is initialised: random initialisations produce plausible-looking images that are not ground-truth samples, whereas initialisations near the ground truth can recover actual training samples, though membership cannot be certified. The authors compare the bilevel approach to the DecoReco method, finding similar initialisation sensitivity and showing that both formulations exhibit an energy landscape with many local minima. Collectively, the results highlight a nuanced privacy implication: even plausible-looking reconstructions may not reliably reveal whether a given image was part of the training data, which complicates membership claims in practical settings.
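
Written as a display, the bilevel problem described in this summary reads roughly as follows. The symbol $\mathcal{L}$ for the lower-level training objective is an assumed name that the summary does not spell out, $\theta^*$ is presumably the given trained parameter vector, and minimising over $x$ alone (with $y$ held fixed) is likewise an assumption:

$$
\min_{x} \; l\bigl(\theta^*, \theta(x,y)\bigr) \quad \text{subject to} \quad \theta(x,y) \in \arg\min_{\theta} \, \mathcal{L}(\theta; x, y)
$$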

Abstract

Being able to reconstruct training data from the parameters of a neural network is a major privacy concern. Previous works have shown that reconstructing training data is possible under certain circumstances. In this work, we analyse such reconstructions empirically and propose a new formulation of the reconstruction as a solution to a bilevel optimisation problem. We demonstrate that our formulation, as well as previous approaches, depends strongly on the initialisation of the training images $x$ to reconstruct. In particular, we show that a random initialisation of $x$ can lead to reconstructions that resemble valid training samples while not being part of the actual training dataset. Thus, our experiments on affine and one-hidden-layer networks suggest that, when reconstructing natural images, an adversary cannot identify whether reconstructed images have indeed been part of the set of training samples.
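
The dependence on initialisation can be illustrated with a deliberately tiny toy problem. The sketch below is not the paper's experimental setup: it assumes a scalar ridge regression as the lower-level "training" step (so that $\theta(x,y)$ has a closed form), a squared difference as the upper-level loss, and plain gradient descent with step halving. All names (`train`, `upper_loss`, `reconstruct`, `LAM`) are hypothetical.

```python
import random

LAM = 0.1  # ridge weight of the toy lower-level problem (assumed value)

def train(x, y):
    """Lower level: closed-form minimiser of
    0.5 * sum((theta * x_i - y_i)**2) + 0.5 * LAM * theta**2."""
    a = sum(xi * xi for xi in x) + LAM
    b = sum(xi * yi for xi, yi in zip(x, y))
    return b / a

def upper_loss(x, y, theta_star):
    """Upper level: squared mismatch between retrained and given parameters."""
    return 0.5 * (train(x, y) - theta_star) ** 2

def upper_grad(x, y, theta_star):
    """Analytic gradient of upper_loss with respect to the inputs x."""
    a = sum(xi * xi for xi in x) + LAM
    theta = train(x, y)
    r = theta - theta_star
    return [r * (yi - 2.0 * xi * theta) / a for xi, yi in zip(x, y)]

def reconstruct(x0, y, theta_star, steps=500):
    """Gradient descent on x; the step is halved until the loss does not increase."""
    x = list(x0)
    for _ in range(steps):
        g = upper_grad(x, y, theta_star)
        loss = upper_loss(x, y, theta_star)
        step = 1.0
        while step > 1e-12:
            cand = [xi - step * gi for xi, gi in zip(x, g)]
            if upper_loss(cand, y, theta_star) <= loss:
                x = cand
                break
            step *= 0.5
    return x

random.seed(0)
x_true = [random.gauss(0.0, 1.0) for _ in range(4)]  # "training data"
y = [random.gauss(0.0, 1.0) for _ in range(4)]       # targets, assumed known
theta_star = train(x_true, y)                        # released parameters

x_init = [random.gauss(0.0, 1.0) for _ in range(4)]  # random initialisation
x_rec = reconstruct(x_init, y, theta_star)

print("loss at random init:   ", upper_loss(x_init, y, theta_star))
print("loss after descent:    ", upper_loss(x_rec, y, theta_star))
print("max |x_rec - x_true|:  ", max(abs(a - b) for a, b in zip(x_rec, x_true)))
```

In this toy, four unknown inputs are constrained by a single scalar parameter, so many different `x` reproduce the same `theta_star`. A run started from a random point can therefore lower the upper-level loss without landing on `x_true`, whereas starting at `x_true` already gives zero loss. This mirrors the ambiguity discussed in the abstract only qualitatively.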

Paper Structure

This paper contains 12 sections, 9 equations, 12 figures, 1 table.

Figures (12)

  • Figure 1: Reconstructions of training samples (black bounding box (BB)) and their nearest neighbour of the dataset (gray BB) for a random init. of $x$.
  • Figure 2: Reconstructions of training samples (black BB) together with their nearest neighbour of the dataset (gray BB) for a ground truth init. of $x$.
  • Figure 3: Reconstruction of training samples for affine classifier, CIFAR partition init. for $x$.
  • Figure 4: Reconstruction of training samples for one-hidden layer classifier, CIFAR partition init. for $x$.
  • Figure 5: DecoReco reconstructions (black BB) for a random init. of $x$ without finetuning the variance of the distribution vs nearest neighbour of training set (gray BB).
  • ...and 7 more figures