Only My Model On My Data: A Privacy Preserving Approach Protecting one Model and Deceiving Unauthorized Black-Box Models

Weiheng Chai; Brian Testa; Huantao Ren; Asif Salekin; Senem Velipasalar

TL;DR

This work tackles practical privacy protection for image data by proposing feature map distortion (FMD), a transfer-based perturbation that keeps images recognizable to humans and preserves the accuracy of a designated authorized model while degrading unauthorized black-box models. The authors formalize the objective as $\mathcal{J} = \mathcal{L}' - \lambda\mathcal{L}_p$, with $\mathcal{L}' = \mathrm{MSE}(f_{feat}(x), f_{feat}(x^{*}))$ and $\mathcal{L}_p = \mathrm{CE}(f_{logits}(x^{*}), y_{gt})$, where $x$ is the original image and $x^{*}$ the generated image. The perturbation is initialized within an $\ell_{\infty}$ budget $\epsilon$ and updated by gradient steps. Evaluations on ImageNet, Celeba-HQ, and AffectNet show that the authorized model's accuracy is almost fully preserved (100%, 99.92%, and 99.67%, respectively), while the average accuracy of the unauthorized models drops to 11.97%, 6.63%, and 55.51%. The experiments use six pretrained models and one generation setup throughout ($\epsilon=16$, $\alpha=4$, $N=100$). The study also explores cross-task transferability: the perturbations can degrade models trained for other tasks (e.g., object detection, ethnicity classification), but how well they transfer depends on the task and on model characteristics. Ablation analyses examine layer choice, parameter settings, and the trade-off between image quality and protection performance, guiding practical deployment and future improvements in cross-task robustness.
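The objective above can be made concrete with a small NumPy sketch. This is not the authors' implementation: it assumes a toy linear "authorized model" (the hypothetical matrices `W_feat` and `W_cls` stand in for $f_{feat}$ and $f_{logits}$), assumes both terms are computed on that same model, and assumes a sign-gradient ascent step of size $\alpha$ followed by projection onto the $\ell_{\infty}$ ball, none of which is stated in this summary. The function name `fmd_sketch` and the default `lam=1.0` are likewise illustrative.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def fmd_sketch(x, y_gt, W_feat, W_cls, eps=16.0, alpha=4.0, n_iter=100,
               lam=1.0, seed=0):
    """Ascend J = MSE(feat(x), feat(x*)) - lam * CE(logits(x*), y_gt).

    Toy assumption: feat(x) = W_feat @ x and logits = W_cls @ feat(x).
    """
    rng = np.random.default_rng(seed)
    feat_clean = W_feat @ x
    onehot = np.zeros(W_cls.shape[0])
    onehot[y_gt] = 1.0
    # Random start inside the eps budget; starting exactly at x would make
    # the MSE gradient zero.
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 255.0)
    for _ in range(n_iter):
        feat = W_feat @ x_adv
        probs = softmax(W_cls @ feat)
        # Analytic gradients of both terms with respect to x_adv.
        grad_mse = 2.0 * W_feat.T @ (feat - feat_clean) / feat.size
        grad_ce = W_feat.T @ (W_cls.T @ (probs - onehot))
        grad_j = grad_mse - lam * grad_ce
        # Assumed update: sign-gradient ascent, then projection.
        x_adv = x_adv + alpha * np.sign(grad_j)
        x_adv = np.clip(x_adv, x - eps, x + eps)   # stay within l_inf budget
        x_adv = np.clip(x_adv, 0.0, 255.0)         # stay a valid image
    return x_adv

# Toy usage on random data (pixel scale 0..255, not a real image or model).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 255.0, size=64)
W_feat = rng.normal(size=(32, 64)) / 64.0
W_cls = rng.normal(size=(10, 32))
y_gt = int(np.argmax(W_cls @ (W_feat @ x)))        # label the toy model predicts
x_star = fmd_sketch(x, y_gt, W_feat, W_cls)
linf = float(np.abs(x_star - x).max())
feat_mse = float(np.mean((W_feat @ x_star - W_feat @ x) ** 2))
```

In this sketch the first term pushes the features of $x^{*}$ away from those of $x$, while the cross-entropy term, entering with a negative sign, pulls the logits back toward $y_{gt}$. A real model would need automatic differentiation in place of the analytic gradients used here.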

Abstract

Deep neural networks are extensively applied to real-world tasks, such as face recognition and medical image classification, where privacy and data protection are critical. Image data, if not protected, can be exploited to infer personal or contextual information. Existing privacy preservation methods, like encryption, generate perturbed images that are unrecognizable even to humans. Adversarial attack approaches prohibit automated inference even for authorized stakeholders, limiting practical incentives for commercial and widespread adoption. This pioneering study addresses these gaps by tackling an unexplored practical privacy preservation use case: generating human-perceivable images that maintain accurate inference by an authorized model while evading unauthorized black-box models of similar or dissimilar objectives. The datasets employed are ImageNet for image classification, Celeba-HQ for identity classification, and AffectNet for emotion classification. Our results show that the generated images can successfully maintain the accuracy of a protected model and degrade the average accuracy of the unauthorized black-box models to 11.97%, 6.63%, and 55.51% on the ImageNet, Celeba-HQ, and AffectNet datasets, respectively.

Paper Structure (16 sections, 5 equations, 7 figures, 11 tables, 1 algorithm)

Introduction
Related Work
Protect & Attack Model
Methodology
Experimental Results
Results on the ImageNet Dataset
Results on Celeba-HQ dataset
Results on the AffectNet dataset
Cross-task feasibility study
Analysis of Results
Ablation Studies
Layer Selection for FMD
Parameter Setting for FMD
Image Quality and Protection Performance
Performance Analysis of Task Complexity
...and 1 more sections

Figures (7)

Figure 1: Proposed feature-based privacy protection. $x$ denotes the input image; $x^*_i$ and $x^*_{i+1}$ denote the generated images after the $i^{th}$ and $(i+1)^{th}$ iterations, respectively.
Figure 2: Examples of object detection results from Mobile_Net_V3_SSDLite on the original and generated images, showing false detections (first row) and missed detections (second and third rows).
Figure 3: (A) Example image and a protected variant generated by applying FMD on VGG16. (B) Grad-CAM heat maps for these images for VGG16. (C) Grad-CAM heat maps for ResNet18.
Figure 4: (A) Final convolution layers in VGG16. (B) Final convolution layers in ResNet18.
Figure 5: Highest softmax values for the authorized and unauthorized models on the original and protected images.
...and 2 more figures
