
You Can Use But Cannot Recognize: Preserving Visual Privacy in Deep Neural Networks

Qiushi Li, Yan Zhang, Ju Ren, Qi Li, Yaoxue Zhang

TL;DR

This work addresses visual privacy in DNN training and inference, where conventional differential privacy cannot conceal visual features without harming utility. It introduces Visual Feature Entropy (VFE), a task-agnostic privacy metric, and VisualMixer (VIM), a noise-free, region-wise pixel shuffling method that uses VFE to guide how visual content is obfuscated. To keep training stable on obfuscated data, it also proposes ST-Adam, an optimization strategy that reduces gradient oscillations. Across multiple datasets and tasks, VisualMixer preserves visual privacy at an average accuracy loss of about 2.35 percentage points and shows strong resistance to privacy-leakage attacks, while remaining compatible with federated learning and knowledge distillation.

Abstract

Image data have been used extensively in Deep Neural Network (DNN) tasks in various scenarios, e.g., autonomous driving and medical image analysis, which raises significant privacy concerns. Existing privacy protection techniques are unable to protect such data efficiently. For example, Differential Privacy (DP), an emerging technique that protects data with a strong privacy guarantee, cannot effectively protect the visual features of an exposed image dataset. In this paper, we propose a novel privacy-preserving framework, VisualMixer, that protects the training data of visual DNN tasks by pixel shuffling, without injecting any noise. VisualMixer uses a new privacy metric called Visual Feature Entropy (VFE) to quantify the visual features of an image from both biological and machine vision aspects. In VisualMixer, we devise a task-agnostic image obfuscation method to protect the visual privacy of data for DNN training and inference. For each image, it determines the regions in which to shuffle pixels, and the sizes of these regions, according to the desired VFE. Within these regions it shuffles pixels both in the spatial domain and in the chromatic channel space without injecting noise, so that visual features cannot be discerned or recognized while the accuracy loss stays negligible. Extensive experiments on real-world datasets demonstrate that VisualMixer effectively preserves visual privacy with negligible accuracy loss, i.e., an average loss of 2.35 percentage points in model accuracy, and almost no performance degradation in model training.
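
The region-wise shuffling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper name `shuffle_windows`, the fixed `window_size`, and the `seed` parameter are assumptions made for illustration, whereas VisualMixer chooses the regions and their sizes from the desired VFE. The sketch only shows the core property that values are permuted across spatial positions and colour channels inside each window, so no noise is added and every window keeps exactly the same set of values.

```python
import numpy as np


def shuffle_windows(image, window_size, seed=0):
    """Shuffle values inside each non-overlapping window of an H x W x C image.

    Hypothetical helper, not the paper's code: the window size is fixed here,
    whereas VisualMixer selects regions and sizes from a target VFE.
    """
    rng = np.random.default_rng(seed)
    out = image.copy()  # leave the input image untouched
    height, width, _ = image.shape
    for top in range(0, height, window_size):
        for left in range(0, width, window_size):
            block = out[top:top + window_size, left:left + window_size, :]
            # Permute across spatial positions and colour channels together.
            flat = rng.permutation(block.reshape(-1))
            out[top:top + window_size, left:left + window_size, :] = flat.reshape(block.shape)
    return out


# Toy 4 x 4 RGB image split into four 2 x 2 windows.
image = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
mixed = shuffle_windows(image, window_size=2)
```

Because each window is only permuted, per-window statistics such as the mean and the colour histogram are unchanged, which is one intuition for why a model can still learn from the obfuscated data.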

Paper Structure

This paper contains 29 sections, 21 equations, 11 figures, 6 tables, 1 algorithm.

Figures (11)

  • Figure 1: Our work attempts to protect visual privacy through self-transformation guided by a metric of semantic features. DP, by contrast, adds external noise to images to prevent adversaries from distinguishing whether a sample is present in the dataset; it is not intended for visual privacy protection in the context of dataset publication.
  • Figure 2: VFE of Images Obfuscated by Adding More Noise with DP. ($\sigma^2$ reflects the amount of noise added to the image dataset and feature map during training. ACC denotes the accuracy of the ShuffleNet model trained on the obfuscated images.)
  • Figure 3: VFE of Obfuscated Images under Different Shuffling Strategies. ($\text{WS}$ denotes the window size used in the corresponding shuffling strategy. ACC denotes the accuracy of the ShuffleNet model trained on the obfuscated images.)
  • Figure 4: The Architecture and Working Process of VisualMixer
  • Figure 5: Comparing gradients of original and VisualMixed images.
  • ...and 6 more figures