You Can Use But Cannot Recognize: Preserving Visual Privacy in Deep Neural Networks
Qiushi Li, Yan Zhang, Ju Ren, Qi Li, Yaoxue Zhang
TL;DR
This work tackles the challenge of protecting visual privacy in DNN training and inference, where traditional differential privacy fails to conceal visual features without harming utility. It introduces Visual Feature Entropy (VFE) as a task-agnostic privacy metric and VisualMixer (VIM), a noise-free, region-wise pixel shuffling method guided by VFE to obfuscate visual content. To sustain model training on obfuscated data, it also proposes ST-Adam, a stabilized optimization strategy that reduces gradient oscillations. Across multiple datasets and tasks, VisualMixer preserves privacy with an average accuracy loss of around 2.35 percentage points and shows strong resistance to privacy-leakage attacks, while remaining compatible with federated learning and knowledge distillation. Overall, VisualMixer provides a practical, efficient approach to privacy-preserving vision that preserves utility and supports real-world DNN workflows.
Abstract
Image data have been extensively used in Deep Neural Network (DNN) tasks in various scenarios, e.g., autonomous driving and medical image analysis, which raises significant privacy concerns. Existing privacy protection techniques are unable to protect such data efficiently. For example, Differential Privacy (DP), an emerging technique that protects data with strong privacy guarantees, cannot effectively protect the visual features of exposed image datasets. In this paper, we propose a novel privacy-preserving framework, VisualMixer, that protects the training data of visual DNN tasks by pixel shuffling, without injecting any noise. VisualMixer utilizes a new privacy metric called Visual Feature Entropy (VFE) to quantify the visual features of an image from both biological and machine vision aspects. In VisualMixer, we devise a task-agnostic image obfuscation method to protect the visual privacy of data for DNN training and inference. For each image, it determines the regions for pixel shuffling and the sizes of these regions according to the desired VFE. Within these regions, it shuffles pixels both in the spatial domain and in the chromatic channel space without injecting noise, so that visual features cannot be discerned or recognized, while incurring negligible accuracy loss. Extensive experiments on real-world datasets demonstrate that VisualMixer can effectively preserve visual privacy with negligible accuracy loss, i.e., an average loss of 2.35 percentage points in model accuracy, and almost no performance degradation in model training.
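The region-wise shuffling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a fixed square window size (`block`) for every region, whereas VisualMixer chooses the regions and their sizes from the desired VFE, which is not computed here. The function name `shuffle_regions`, its parameters, and the nested-list image layout are hypothetical and chosen only for illustration.

```python
import random


def shuffle_regions(image, block=4, seed=0):
    """Shuffle values inside each block x block window (hypothetical helper).

    image: list of rows; each row is a list of pixels; each pixel is a list
    of channel values. Values are permuted across both spatial positions and
    channels within a window. Nothing is added or altered, so the operation
    is noise-free in the sense that every original value is kept.
    """
    rng = random.Random(seed)  # seeded so the permutation is reproducible
    height, width = len(image), len(image[0])
    channels = len(image[0][0])
    out = [[list(pixel) for pixel in row] for row in image]

    for top in range(0, height, block):
        for left in range(0, width, block):
            # All (row, col, channel) slots in this window, clipped at borders.
            slots = [
                (r, c, k)
                for r in range(top, min(top + block, height))
                for c in range(left, min(left + block, width))
                for k in range(channels)
            ]
            values = [image[r][c][k] for r, c, k in slots]
            rng.shuffle(values)  # joint spatial + channel permutation
            for (r, c, k), value in zip(slots, values):
                out[r][c][k] = value
    return out


if __name__ == "__main__":
    # Tiny 8x8 RGB-like image with distinct values, for demonstration only.
    demo = [[[(r * 8 + c) * 3 + k for k in range(3)] for c in range(8)]
            for r in range(8)]
    mixed = shuffle_regions(demo, block=4, seed=1)
    print(mixed[0][0], mixed[7][7])
```

Because each window is only permuted, the set of values inside every window is unchanged, while the local spatial and chromatic arrangement is scrambled. A fixed `block` is the simplest assumption; a VFE-guided variant would pick a different window size per region.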
