Private and Fair Machine Learning: Revisiting the Disparate Impact of Differentially Private SGD

Lea Demelius; Dominik Kowald; Simone Kopeinik; Roman Kern; Andreas Trügler

TL;DR

This paper investigates how training with differentially private SGD (DPSGD) affects fairness across multiple performance metrics and hyperparameter settings. It examines the claim that tuning hyperparameters directly on DP models can match the fairness of non-private models. The effect turns out to vary strongly with metric and dataset, and tuning does not reliably close the fairness gap, although it can improve the utility-fairness trade-off compared to re-using hyperparameters from non-private models. The authors also evaluate DPSGD-Global-Adapt, finding that it may not be robust to hyperparameter choice, and discuss the additional privacy leakage that hyperparameter tuning entails. Overall, the work highlights the need for careful metric selection, dataset-aware analysis, and private hyperparameter tuning methods to achieve private and fair ML in practice.

Abstract

Differential privacy (DP) is a prominent method for protecting information about individuals during data analysis. Training neural networks with differentially private stochastic gradient descent (DPSGD) influences the model's learning dynamics and, consequently, its output. This can affect the model's performance and fairness. While the majority of studies on the topic report a negative impact on fairness, it has recently been suggested that fairness levels comparable to non-private models can be achieved by optimizing hyperparameters for performance directly on differentially private models (rather than re-using hyperparameters from non-private models, as is common practice). In this work, we analyze the generalizability of this claim by 1) comparing the disparate impact of DPSGD on different performance metrics, and 2) analyzing it over a wide range of hyperparameter settings. We highlight that a disparate impact on one metric does not necessarily imply a disparate impact on another. Most importantly, we show that while optimizing hyperparameters directly on differentially private models does not reliably mitigate the disparate impact of DPSGD, it can still lead to improved utility-fairness trade-offs compared to re-using hyperparameters from non-private models. We stress, however, that any form of hyperparameter tuning entails additional privacy leakage, calling for careful consideration of how to balance privacy, utility and fairness. Finally, we extend our analyses to DPSGD-Global-Adapt, a variant of DPSGD designed to mitigate the disparate impact on accuracy, and conclude that this alternative may not be a robust solution with respect to hyperparameter choice.
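For readers unfamiliar with DPSGD, the following is a minimal sketch of a single update step in its commonly described form: per-example gradients are clipped to a fixed L2 norm, Gaussian noise scaled to that norm is added to their sum, and the result is averaged. It is an illustration under stated assumptions, not the authors' implementation: the logistic-regression model and the names `per_example_grads`, `clip_per_example`, `dpsgd_step`, `clip_norm`, and `noise_multiplier` are all hypothetical choices made here. The clipping norm and noise multiplier are examples of the DP-specific hyperparameters whose tuning the abstract discusses.

```python
import numpy as np

def per_example_grads(w, X, y):
    """Logistic-loss gradient for each example; shape (batch, dim)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (p - y)[:, None] * X

def clip_per_example(grads, clip_norm):
    """Rescale each row so its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return grads * scale

def dpsgd_step(w, X, y, lr, clip_norm, noise_multiplier, rng):
    """One DPSGD-style update on a mini-batch (illustrative sketch only)."""
    clipped = clip_per_example(per_example_grads(w, X, y), clip_norm)
    # Gaussian noise with standard deviation noise_multiplier * clip_norm
    # is added to the summed clipped gradients before averaging.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_grad = (clipped.sum(axis=0) + noise) / len(X)
    return w - lr * noisy_grad
```

Because clipping rescales each example's gradient separately, examples with large gradients are shrunk more than others; this is one commonly cited reason why DPSGD can affect groups in the data differently.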
