Privacy Preserving Properties of Vision Classifiers
Pirzada Suhail, Amit Sethi
TL;DR
This paper addresses the privacy risks of sharing pre-trained vision classifiers by systematically comparing three architectural families (MLPs, CNNs, and ViTs) under a network inversion attack. It introduces an end-to-end inversion framework for an extreme setting: a vector-matrix conditioned generator reconstructs training-like data from a single trained model with minimal priors. The evaluation covers MNIST, FashionMNIST, SVHN, and CIFAR-10. Measured by SSIM between reconstructions and training data, MLPs exhibit the strongest memorization (highest reconstruction similarity), CNNs the least, and ViTs fall in between. These results highlight how architectural choices shape privacy leakage and provide guidance for designing privacy-aware vision systems in sensitive applications. The paper also outlines directions for future work, such as analyzing the effect of model depth and dataset complexity and integrating differential privacy as a defense.
Abstract
Vision classifiers are often trained on proprietary datasets containing sensitive information, yet the models themselves are frequently shared openly on the assumption that they do not expose their training data. The extent to which this assumption holds across different architectures remains unexplored. Inversion attacks, which attempt to reconstruct training data from model weights, challenge it and expose significant privacy vulnerabilities. In this study, we systematically evaluate the privacy-preserving properties of vision classifiers across diverse architectures, including Multi-Layer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), and Vision Transformers (ViTs). Using network inversion-based reconstruction techniques, we assess the extent to which these architectures memorize and reveal training data, quantifying the relative ease of reconstruction across models. Our analysis highlights how architectural differences, such as input representation, feature extraction mechanisms, and weight structures, influence privacy risks. By comparing these architectures, we identify which are more resilient to inversion attacks and examine the trade-offs between model performance and privacy preservation. Our findings provide actionable insights into the design of secure and privacy-aware machine learning systems, emphasizing the importance of evaluating architectural decisions in sensitive applications involving proprietary or personal data.
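As an illustration of how the relative ease of reconstruction might be quantified with SSIM, the metric named in the TL;DR, the sketch below scores a reconstructed image against a set of training images and keeps the best match. It is a minimal sketch under stated assumptions, not the authors' evaluation code: it computes SSIM from global image statistics rather than the usual sliding local windows, and the function names `global_ssim` and `best_match_ssim`, as well as the best-match scoring rule, are hypothetical choices made for this example.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM computed from whole-image statistics.

    Assumption: standard SSIM averages this quantity over local windows;
    here a single global window is used to keep the example short.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")

    # Stabilizing constants from the standard SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def best_match_ssim(reconstruction, training_images):
    """Hypothetical leakage score: highest SSIM against any training image."""
    return max(global_ssim(reconstruction, t) for t in training_images)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = [rng.random((28, 28)) for _ in range(5)]
    # A noisy copy of one training image stands in for an inverted sample.
    recon = np.clip(train[2] + 0.05 * rng.standard_normal((28, 28)), 0.0, 1.0)
    print("best-match SSIM:", best_match_ssim(recon, train))
```

Under this scoring rule, a higher best-match value for one architecture than another would suggest that its reconstructions sit closer to actual training samples. A windowed SSIM implementation would be the more faithful choice for reporting results.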
