PEEL the Layers and Find Yourself: Revisiting Inference-time Data Leakage for Residual Neural Networks

Huzaifa Arif, Keerthiram Murugesan, Payel Das, Alex Gittens, Pin-Yu Chen

TL;DR

PEEL exposes a pronounced inference-time data leakage risk in residual networks. It performs layer-wise backward feature inversion, reconstructing private inputs from intermediate residual outputs without relying on training data priors. The method solves a non-convex optimization problem for each residual block and inverts the network block by block, achieving high-fidelity recoveries on facial, chest X-ray, and ImageNet data and often outperforming GAN-based baselines in exactness (lower MSE and KNN Distance). The findings raise privacy concerns for honest-but-curious (HbC) scenarios, split learning, and cached intermediate representations, and they suggest that residual architectures need robust privacy-preserving mechanisms. The work also demonstrates PEEL's applicability across architectures and datasets, and it outlines factors that modulate leakage, such as pooling, activation choices, and weight initialization.

Abstract

This paper explores inference-time data leakage risks of deep neural networks (NNs), where an honest-but-curious model service provider is interested in retrieving users' private data inputs based solely on the model inference results. In particular, we revisit residual NNs because of their popularity in computer vision and our hypothesis that residual blocks are a primary cause of data leakage owing to their skip connections. By formulating inference-time data leakage as a constrained optimization problem, we propose a novel backward feature inversion method, PEEL, which can effectively recover block-wise input features from the intermediate output of residual NNs. The surprisingly high quality of the recovered inputs can be explained by the intuition that the output of a residual block can be regarded as a noisy version of its input, so the output retains sufficient information for input recovery. We demonstrate the effectiveness of our layer-by-layer feature inversion method on facial image datasets and pre-trained classifiers. Our results show that PEEL outperforms state-of-the-art recovery methods by an order of magnitude when evaluated by mean squared error (MSE). The code is available at https://github.com/Huzaifa-Arif/PEEL.
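
The intuition in the abstract, that a residual block's output is a noisy copy of its input, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the block is a small dense map `y = x + W2 @ relu(W1 @ x)` rather than a convolutional pre-activation block, the weights are deliberately small so the residual branch is a mild perturbation, and the inversion uses plain unconstrained gradient descent on the squared residual instead of the constrained solver used by PEEL. The names `residual_block`, `invert_block`, and `peel` are hypothetical.

```python
import numpy as np

def residual_block(x, W1, W2):
    """Toy residual block (assumed form): y = x + W2 @ relu(W1 @ x)."""
    return x + W2 @ np.maximum(W1 @ x, 0.0)

def invert_block(y, W1, W2, steps=100, lr=1.0):
    """Recover x from y by gradient descent on 0.5 * ||x + F(x) - y||^2."""
    x = y.copy()  # the output is a noisy copy of the input, so start there
    for _ in range(steps):
        pre = W1 @ x
        r = x + W2 @ np.maximum(pre, 0.0) - y  # residual of the block equation
        # Gradient is J^T r, where J = I + W2 @ diag(pre > 0) @ W1.
        grad = r + W1.T @ ((pre > 0) * (W2.T @ r))
        x = x - lr * grad
    return x

def peel(y, blocks):
    """Invert a chain of blocks back to front, one block at a time."""
    for W1, W2 in reversed(blocks):
        y = invert_block(y, W1, W2)
    return y

rng = np.random.default_rng(0)
d, h = 16, 32  # feature and hidden widths (arbitrary toy sizes)
# Small random weights keep each residual branch a mild perturbation.
blocks = [(0.02 * rng.standard_normal((h, d)),
           0.02 * rng.standard_normal((d, h))) for _ in range(3)]

x_true = rng.standard_normal(d)
out = x_true
for W1, W2 in blocks:  # forward pass through the chain
    out = residual_block(out, W1, W2)

x_rec = peel(out, blocks)
mse = float(np.mean((x_rec - x_true) ** 2))
print("reconstruction MSE:", mse)
```

In this toy setting the skip connection keeps the block Jacobian close to the identity, which is why a simple first-order method suffices. Trained convolutional blocks with pooling and normalization are harder to invert, and that harder case is what the paper's constrained formulation targets.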

Paper Structure

This paper contains 24 sections, 13 equations, 21 figures, 10 tables, 1 algorithm.

Figures (21)

  • Figure 1: Data leakage problem setup: Party A (e.g., an enterprise user) accesses an API to make predictions on its private data. Party B (e.g., a model service provider) has access to the model predictions and model weights. It then uses PEEL to reconstruct the private data of Party A. PEEL is an optimization-based feature inversion method, which features block-wise input recovery from the intermediate output of residual neural networks.
  • Figure 2: Comparison of the embedding inversion method [2015_CVPR] and PEEL (ours). Deeper layers are hard to invert in ResNets for [2015_CVPR], whereas PEEL shows good reconstruction performance.
  • Figure 3: The structure of the preactivation residual blocks following [identity_mapping]. Here $\mathbf{x}$ and $\mathbf{y}$ are the input and output, respectively, of the residual block. ResNet architectures consist of multiple residual blocks chained in sequence.
  • Figure 4: PEEL in effect on the ResNet18 architecture. The two panels (PEEL_untrained and PEEL_trained) show reconstruction on a randomly initialized and a pretrained ResNet18, respectively. The intermediate results from each layer visualize the features in a small subset of the channels. The top half of each panel shows the ground-truth feature maps and the bottom half shows the feature maps reconstructed using PEEL.
  • Figure 5: Reconstruction of a 5-channel, 8-by-8 image from the output of a residual block using a CIFAR-10 data sample. The residual error in the image recovered by solving the residual-block inversion problem (resblock_inversion) using PyGRANSO has $\ell_2$-norm 3.64 and 1.14% relative error.
  • ...and 16 more figures