Spurious Privacy Leakage in Neural Networks
Chenxiang Zhang, Jun Pang, Sjouke Mauw
TL;DR
This work addresses privacy risks arising from spurious correlations in real-world data by introducing spurious privacy leakage and a corresponding group privacy disparity under membership inference attacks. It applies LiRA-based MIAs to five real-world spurious datasets, evaluates spurious robust methods (DRO, DFR) and differential privacy, and analyzes multiple architecture families to understand privacy dynamics. The key findings show consistent subgroup privacy disparities that are not reliably mitigated by current robust training techniques, and that while differential privacy can improve worst-group protection, it often harms utility; architecture and pretraining also influence privacy auditing. These results underscore the need for fine-grained, group-level privacy auditing in biased data settings and point to directions for improving defenses and auditing practices in practical deployments.
Abstract
Neural networks trained on real-world data often exhibit biases while simultaneously being vulnerable to privacy attacks aimed at extracting sensitive information. Despite extensive research on each problem individually, their intersection remains poorly understood. In this work, we investigate the privacy impact of spurious correlation bias. We introduce spurious privacy leakage, a phenomenon in which spurious groups are significantly more vulnerable to privacy attacks than non-spurious groups. We observe that privacy disparity between groups increases in tasks with simpler objectives (e.g., fewer classes) due to spurious features. Counterintuitively, we demonstrate that spurious robust methods, designed to reduce spurious bias, fail to mitigate privacy disparity. Our analysis reveals that this occurs because robust methods can reduce reliance on spurious features for prediction, but do not prevent their memorization during training. Finally, we systematically compare the privacy of different model architectures trained with spurious data, demonstrating that, contrary to previous work, architectural choice can affect privacy evaluation.
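To make the notion of group-level privacy disparity concrete, the following is a minimal sketch of how per-group attack success could be measured once a membership inference attack has produced a score for each sample. It is an illustration under stated assumptions, not the authors' implementation: the function names `tpr_at_fpr` and `group_privacy_disparity` are hypothetical, the metric (true positive rate at a fixed low false positive rate, with a threshold calibrated separately on each group's non-members) is one common choice for auditing, and summarizing disparity as the gap between the most and least vulnerable group is an assumption made here for simplicity.

```python
import numpy as np


def tpr_at_fpr(scores, is_member, target_fpr=0.01):
    """True positive rate of a score-based attack at a fixed false positive rate.

    scores: higher means "more likely a training member" (e.g. an attack score).
    is_member: boolean ground-truth membership labels.
    """
    scores = np.asarray(scores, dtype=float)
    is_member = np.asarray(is_member, dtype=bool)
    if is_member.all() or not is_member.any():
        raise ValueError("need at least one member and one non-member")

    # Sort non-member scores from highest to lowest.
    non_member_scores = np.sort(scores[~is_member])[::-1]
    # Number of non-members we are allowed to flag without exceeding target_fpr.
    k = int(np.floor(target_fpr * non_member_scores.size))
    # Flag samples scoring strictly above the (k+1)-th largest non-member score,
    # so at most k non-members are flagged.
    threshold = non_member_scores[k] if k < non_member_scores.size else -np.inf
    return float(np.mean(scores[is_member] > threshold))


def group_privacy_disparity(scores, is_member, groups, target_fpr=0.01):
    """Per-group TPR at fixed FPR, plus the max-min gap across groups.

    Assumption: each group's threshold is calibrated on that group's own
    non-members; a single global threshold is an equally plausible design.
    """
    scores = np.asarray(scores, dtype=float)
    is_member = np.asarray(is_member, dtype=bool)
    groups = np.asarray(groups)

    per_group = {}
    for g in np.unique(groups):
        mask = groups == g
        per_group[g.item()] = tpr_at_fpr(scores[mask], is_member[mask], target_fpr)

    rates = list(per_group.values())
    return per_group, max(rates) - min(rates)
```

A large gap returned by `group_privacy_disparity` would indicate that the attack succeeds far more often on some groups than on others, which an aggregate TPR over the whole dataset would hide.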
