Real-Time Privacy Risk Measurement with Privacy Tokens for Gradient Leakage
Jiayang Meng, Tao Huang, Hong Chen, Xin Shi, Qingyu Huang, Chen Hou
TL;DR
The paper addresses gradient leakage during training in privacy-sensitive domains by introducing privacy tokens (embeddings of intermediate gradients) and using the Mutual Information (MI) between training data and gradients, estimated with a Mutual Information Neural Estimator (MINE), to quantify leakage in real time. It formalizes $I(\mathcal{X}; G)$ via the Donsker-Varadhan representation and designs a MINE-based network that fuses data features from intermediate layers with gradient tokens (via an Autoencoder or Transformer) to estimate leakage continuously. Through experiments on CIFAR-10 and CelebA-HQ with multiple architectures, the authors show that MI differences between matched and mismatched data–gradient pairs correlate with the success of gradient-based attacks, validating the approach as a proactive privacy monitoring tool. The results suggest practical implications for adaptive privacy budgets and reinforce the potential of privacy tokens for safer deployment of deep learning in sensitive applications.
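For reference, the Donsker-Varadhan representation mentioned above has the following standard form. The critic function $T$, its neural parameterization $T_\theta$, and the distribution symbols $P_{\mathcal{X}G}$ (joint) and $P_{\mathcal{X}} \otimes P_{G}$ (product of marginals) are notation assumed here for illustration and may differ from the paper's own.

$$
I(\mathcal{X}; G) = D_{\mathrm{KL}}\left(P_{\mathcal{X}G} \,\|\, P_{\mathcal{X}} \otimes P_{G}\right) = \sup_{T} \; \mathbb{E}_{P_{\mathcal{X}G}}[T] - \log \mathbb{E}_{P_{\mathcal{X}} \otimes P_{G}}\left[e^{T}\right]
$$

$$
I(\mathcal{X}; G) \;\geq\; \sup_{\theta} \; \mathbb{E}_{P_{\mathcal{X}G}}[T_\theta] - \log \mathbb{E}_{P_{\mathcal{X}} \otimes P_{G}}\left[e^{T_\theta}\right]
$$

The second line is the lower bound that a MINE-style estimator maximizes over the parameters $\theta$. In such estimators the first expectation is typically computed over paired samples and the second over shuffled pairings, which plausibly corresponds to the matched and mismatched data–gradient pairs described above.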
Abstract
The widespread deployment of deep learning models in privacy-sensitive domains has amplified concerns regarding privacy risks, particularly those stemming from gradient leakage during training. Current privacy assessments primarily rely on post-training attack simulations. However, these methods are inherently reactive, unable to encompass all potential attack scenarios, and often based on idealized adversarial assumptions. These limitations underscore the need for proactive approaches to privacy risk assessment during the training process. To address this gap, we propose the concept of privacy tokens, which are derived directly from private gradients during training. Privacy tokens encapsulate gradient features and, when combined with data features, offer valuable insights into the extent of private information leakage from training data, enabling real-time measurement of privacy risks without relying on adversarial attack simulations. Additionally, we employ Mutual Information (MI) as a robust metric to quantify the relationship between training data and gradients, providing precise and continuous assessments of privacy leakage throughout the training process. Extensive experiments validate our framework, demonstrating the effectiveness of privacy tokens and MI in identifying and quantifying privacy risks. This proactive approach marks a significant advancement in privacy monitoring, promoting the safer deployment of deep learning models in sensitive applications.
