Differentially Private and Adversarially Robust Machine Learning: An Empirical Evaluation
Janvi Thakkar, Giulio Zizzo, Sergio Maffeis
TL;DR
The paper addresses the challenge of defending ML models against simultaneous privacy and robustness attacks by empirically evaluating DP-Adv, which combines differential privacy with adversarial training. It benchmarks DP-Adv against three baselines (no defense, adversarial training only, and DP training only) on MNIST, Fashion-MNIST, and CIFAR-10, using membership inference attacks to assess both individual and group privacy and reporting a DP-like privacy cost $(\epsilon, \delta)$ under dynamic training. The results show that DP-Adv offers privacy comparable to that of non-robust DP models, with limited group leakage even though the training dataset changes each epoch, but at a notable utility cost on more complex datasets. The work highlights the need for formal privacy guarantees in dynamic training paradigms and motivates further analysis of privacy properties when the training data evolves during optimization.
Abstract
Malicious adversaries can attack machine learning models to infer sensitive information or damage the system by launching a series of evasion attacks. Although various works address privacy and security concerns, they focus on individual defenses, whereas in practice models may face simultaneous attacks. This study explores the combination of adversarial training and differentially private training to defend against simultaneous attacks. While differentially private adversarial training, as presented in DP-Adv, outperforms other state-of-the-art methods, it lacks formal privacy guarantees and empirical validation. Thus, in this work, we benchmark this technique using a membership inference attack and empirically show that the resulting approach is as private as non-robust private models. This work also highlights the need to explore privacy guarantees in dynamic training paradigms.
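
To make the combination of adversarial training and differentially private training concrete, the following is a minimal sketch of one training step under stated assumptions. It is not the DP-Adv implementation: the logistic-regression model, the single-step FGSM perturbation, and all names (`fgsm_perturb`, `dp_adv_step`, `attack_eps`, `clip`, `noise_mult`) are hypothetical choices made for illustration. The sketch only shows the general pattern of crafting adversarial inputs and then applying a DP-SGD-style update (per-example gradient clipping plus Gaussian noise) to them. Because the adversarial inputs are regenerated from the current weights at every step, the effective training set changes over time, which is the "dynamic training" setting the abstract refers to.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(w, x, y, attack_eps):
    """Single-step FGSM on a logistic-regression model (illustrative assumption).

    The gradient of the logistic loss with respect to the input x is
    (sigmoid(w . x) - y) * w; each input moves attack_eps along its sign.
    """
    grad_x = (sigmoid(x @ w) - y)[:, None] * w[None, :]
    return x + attack_eps * np.sign(grad_x)

def dp_adv_step(w, x, y, attack_eps, clip, noise_mult, lr, rng):
    """One hypothetical update: adversarial examples, then a DP-SGD-style step."""
    # 1. Craft adversarial inputs against the current weights.
    x_adv = fgsm_perturb(w, x, y, attack_eps)
    # 2. Per-example gradients of the logistic loss with respect to w.
    per_example = (sigmoid(x_adv @ w) - y)[:, None] * x_adv
    # 3. Clip each per-example gradient to L2 norm at most `clip`.
    norms = np.linalg.norm(per_example, axis=1, keepdims=True)
    clipped = per_example / np.maximum(1.0, norms / clip)
    # 4. Add Gaussian noise scaled to the clipping bound, then average.
    noise = rng.normal(0.0, noise_mult * clip, size=w.shape)
    grad = (clipped.sum(axis=0) + noise) / len(x)
    return w - lr * grad
```

A call such as `dp_adv_step(w, x, y, attack_eps=0.1, clip=1.0, noise_mult=1.1, lr=0.5, rng=np.random.default_rng(0))` returns the updated weight vector; the parameter values here are placeholders, not settings from the paper. Translating `noise_mult` and the number of steps into an $(\epsilon, \delta)$ value requires a privacy accountant, which this sketch omits.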
