The Impact of Generalization Techniques on the Interplay Among Privacy, Utility, and Fairness in Image Classification
Ahmad Hassanpour, Amir Zarei, Khawla Mallat, Anderson Santana de Oliveira, Bian Yang
TL;DR
This work investigates how generalization techniques interact with privacy and fairness in image classification under differential privacy (DP). It evaluates DP-SAT alongside GN, OBS, WS, AM, and PA on CIFAR-10/100, their synthetically biased variants (CIFAR-10S/100S), and CelebA. DP-SAT improves private accuracy (e.g., 81.11% on CIFAR-10 under $(8,10^{-5})$-DP) and generally yields a better privacy-utility balance than DP-SGD. However, the same techniques tend to amplify bias on biased data and on real-world attributes, with higher membership inference attack (MIA) AUC in several settings; the Onion Effect further shows that privacy vulnerabilities persist as outliers are removed. To quantify these trade-offs, the authors introduce the Harmonic Score (HS), which jointly gauges accuracy, privacy leakage, and fairness, and which also worsens in several of these settings. The findings are validated on CelebA, highlighting practical implications for designing privacy-preserving, fair image classifiers. Overall, the work clarifies the promises and limits of generalization techniques in private learning and provides a roadmap for balancing competing objectives on real-world datasets.
Abstract
This study investigates the trade-offs between fairness, privacy, and utility in image classification using machine learning (ML). Recent research suggests that generalization techniques can improve the balance between privacy and utility. One focus of this work is sharpness-aware training (SAT) and its integration with differential privacy (DP-SAT) to further improve this balance. We also examine fairness in both private and non-private models trained on datasets with synthetic and real-world biases. We measure the privacy risks in these scenarios by performing membership inference attacks (MIAs) and explore the consequences of eliminating high-privacy-risk samples, termed outliers. In addition, we introduce a new metric, named \emph{harmonic score}, which combines accuracy, privacy, and fairness into a single measure. Through empirical analysis using generalization techniques, we achieve an accuracy of 81.11\% under $(8, 10^{-5})$-DP on CIFAR-10, surpassing the 79.5\% reported by De et al. (2022). Our experiments further show that memorization of training samples can begin before the overfitting point, and that generalization techniques do not guarantee the prevention of this memorization. Our analysis of synthetic biases shows that generalization techniques can amplify model bias in both private and non-private models. Our results also indicate that increased bias in training data leads to reduced accuracy, greater vulnerability to privacy attacks, and higher model bias. We validate these findings with the CelebA dataset, demonstrating that similar trends persist with real-world attribute imbalances. Finally, our experiments show that removing outlier data decreases accuracy and further amplifies model bias.
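To illustrate how a single measure can combine accuracy, privacy, and fairness, the following is a minimal sketch of a harmonic-mean style score. It is not the definition used in this work, which the abstract does not state. The function names `harmonic_score` and `leakage_from_mia_auc`, the normalization of all inputs to $[0, 1]$, and the mapping from MIA AUC to a leakage value are assumptions made purely for illustration.

```python
def leakage_from_mia_auc(auc: float) -> float:
    """Hypothetical mapping from MIA AUC to a leakage value in [0, 1].

    Assumption: an AUC of 0.5 (chance level) means no measurable leakage
    and an AUC of 1.0 (or 0.0) means maximal leakage.
    """
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie in [0, 1]")
    return min(1.0, 2.0 * abs(auc - 0.5))


def harmonic_score(accuracy: float, privacy_leakage: float, bias: float) -> float:
    """Illustrative harmonic mean of three 'higher is better' components.

    Assumption: accuracy, privacy_leakage, and bias are each normalized to
    [0, 1]; leakage and bias are flipped so that larger values are better.
    """
    components = [accuracy, 1.0 - privacy_leakage, 1.0 - bias]
    for c in components:
        if not 0.0 <= c <= 1.0:
            raise ValueError("all inputs must lie in [0, 1]")
    # The harmonic mean collapses to zero if any component is zero.
    if min(components) == 0.0:
        return 0.0
    return len(components) / sum(1.0 / c for c in components)
```

A harmonic mean is a natural candidate for this kind of metric because it is dominated by the weakest component: under these assumptions, a model with high accuracy but maximal leakage or maximal bias receives a score of zero rather than a moderately good average.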
