Improving Robustness to Model Inversion Attacks via Sparse Coding Architectures

Sayanton V. Dibbo, Adam Breuer, Juston Moore, Michael Teti

TL;DR

The paper addresses the privacy risk posed by model inversion attacks, which reconstruct training data from model outputs. It introduces the Sparse Coding Architecture (SCA), which interleaves sparse coding layers with dense layers to prune private information while preserving classification accuracy; the sparse coding layers are implemented with the Locally Competitive Algorithm (LCA). Across five datasets and three threat models, SCA consistently degrades reconstruction quality (measured by PSNR, SSIM, and FID) while matching or improving on the accuracy of nine baselines, and its defense performance remains stable without parameter tuning. The work links sparse coding theory to privacy protection in ML, provides a PyTorch codebase for replication, and outlines directions for scaling and theoretical guarantees. Overall, SCA offers a principled, data-efficient defense against model inversion with broad applicability to privacy-sensitive domains.
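
To make the idea of Locally Competitive Algorithm-based sparse coding concrete, the following is a minimal NumPy sketch of the textbook LCA dynamics: each neuron's membrane potential is driven by the input, inhibited by other active neurons, and soft-thresholded to produce a sparse code. It is not the authors' implementation. The function names (`soft_threshold`, `lca_sparse_code`), the fully connected dictionary `D`, and the parameter values (`lam`, `tau`, `n_steps`) are assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(u, lam):
    """Shrink potentials toward zero; values with |u| <= lam become exactly 0."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)


def lca_sparse_code(x, D, lam=0.1, tau=10.0, n_steps=200):
    """Sketch of LCA sparse coding (illustrative, hypothetical parameters).

    x: input vector of shape (n_features,)
    D: dictionary of shape (n_features, n_atoms), ideally with unit-norm columns
    Returns a sparse activation vector of shape (n_atoms,).
    """
    n_atoms = D.shape[1]
    drive = D.T @ x                        # feed-forward input to each neuron
    inhibition = D.T @ D - np.eye(n_atoms)  # lateral competition between neurons
    u = np.zeros(n_atoms)                  # membrane potentials
    for _ in range(n_steps):
        a = soft_threshold(u, lam)         # currently active neurons
        # Euler step of the leaky-integrator dynamics with lateral inhibition
        u += (drive - u - inhibition @ a) / tau
    return soft_threshold(u, lam)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.normal(size=(16, 32))
    D /= np.linalg.norm(D, axis=0)         # normalize dictionary atoms
    x = 1.5 * D[:, 3] - 0.8 * D[:, 20]     # input built from two atoms
    code = lca_sparse_code(x, D)
    print("non-zero activations:", np.count_nonzero(code), "of", code.size)
```

The threshold `lam` controls how much of the input survives in the code: a larger value zeroes out more activations, which is the knob a sparse coding layer uses to discard detail that is not needed downstream.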

Abstract

Recent model inversion attack algorithms permit adversaries to reconstruct a neural network's private and potentially sensitive training data by repeatedly querying the network. In this work, we develop a novel network architecture that leverages sparse-coding layers to obtain superior robustness to this class of attacks. Three decades of computer science research has studied sparse coding in the context of image denoising, object recognition, and adversarial misclassification settings, but to the best of our knowledge, its connection to state-of-the-art privacy vulnerabilities remains unstudied. In this work, we hypothesize that sparse coding architectures suggest an advantageous means to defend against model inversion attacks because they allow us to control the amount of irrelevant private information encoded by a network in a manner that is known to have little effect on classification accuracy. Specifically, compared to networks trained with a variety of state-of-the-art defenses, our sparse-coding architectures maintain comparable or higher classification accuracy while degrading state-of-the-art training data reconstructions by factors of 1.1 to 18.3 across a variety of reconstruction quality metrics (PSNR, SSIM, FID). This performance advantage holds across 5 datasets ranging from CelebA faces to medical images and CIFAR-10, and across various state-of-the-art SGD-based and GAN-based inversion attacks, including Plug-&-Play attacks. We provide a cluster-ready PyTorch codebase to promote research and standardize defense evaluations.
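
For readers unfamiliar with the reconstruction metrics named in the abstract, here is a small sketch of PSNR, the simplest of the three. A lower PSNR between a private training image and the attacker's reconstruction means a worse reconstruction, which is the outcome a defense aims for. The function name `psnr` and the assumption that pixel values lie in [0, 1] (`max_value=1.0`) are illustrative choices, not details taken from the paper's codebase.

```python
import numpy as np


def psnr(original, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape.

    max_value is the largest possible pixel value (assumed 1.0 here).
    """
    original = np.asarray(original, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((original - reconstruction) ** 2)  # mean squared error
    if mse == 0:
        return float("inf")                          # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32))
    noisy = np.clip(image + rng.normal(scale=0.1, size=image.shape), 0.0, 1.0)
    print(f"PSNR of noisy copy: {psnr(image, noisy):.2f} dB")
```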

Paper Structure

This paper contains 35 sections, 3 equations, 9 figures, and 15 tables.

Figures (9)

  • Figure 1: Pipeline of neuron (membrane potential) dynamics in Sparse Coding Layer (SCL) with lateral competitions.
  • Figure 2: Architecture of SCA.
  • Figure 3: Experiments set 1: Qualitative comparisons among actual and reconstructed images (Plug-&-Play Attack, Struppek et al., 2022) under SCA & baselines on hi-res CelebA dataset.
  • Figure 4: Experiments set 2: Qualitative comparisons among actual & reconstructed images (end-to-end setting) under SCA & baselines on the Medical MNIST dataset.
  • Figure 5: Stability of SCA & baselines' defense performance (mean $\pm$ std. dev.) of PSNR and FID across multiple runs on CelebA and Medical MNIST.
  • ...and 4 more figures