DP-SGD-Global-Adapt-V2-S: Triad Improvements of Privacy, Accuracy and Fairness via Step Decay Noise Multiplier and Step Decay Upper Clipping Threshold
Sai Venkatesh Chilukoti, Md Imran Hossen, Liqun Shan, Vijay Srinivas Tida, Mahathir Mohammad Bappy, Wenmeng Tian, Xiai Hei
TL;DR
This work targets the trade-offs between privacy, accuracy, and fairness in DP-SGD by identifying convergence issues in DP-SGD-Global-Adapt and proposing DP-SGD-Global-Adapt-V2-S, which combines step-decay noise with step-decay clipping and DP-PSAC clipping when needed. The approach is formalized with decay schedulers (linear, time, and step) and a tCDP-based privacy accountant to track cumulative privacy loss across epochs. Empirical results across MNIST, CIFAR-10, CIFAR-100, unbalanced MNIST, and Thinwall show that step-decay noise yields faster convergence and higher utility, while enhancing fairness as measured by reduced privacy cost gaps. The work also provides explicit mathematical derivations for privacy budgets under various decay schemes and offers practical guidance for hyperparameter selection, contributing to robust, privacy-preserving training in sensitive domains such as additive manufacturing.
Abstract
Differentially Private Stochastic Gradient Descent (DP-SGD) has become a widely used technique for safeguarding sensitive information in deep learning applications. Unfortunately, DP-SGD's per-sample gradient clipping and uniform noise addition during training can significantly degrade model utility and fairness. We observe that the average gradient norm of the recent DP-SGD-Global-Adapt stays the same throughout training, and that integrating it with the existing linear-decay noise multiplier yields little or no advantage. Moreover, we notice that its upper clipping threshold increases exponentially towards the end of training, potentially impacting the model's convergence. Our experiments show that other algorithms (DP-PSAC, Auto-S, DP-SGD-Global, and DP-F) achieve utility and fairness similar to or worse than DP-SGD. To overcome these problems and improve utility and fairness, we developed DP-SGD-Global-Adapt-V2-S, which uses a step-decay noise multiplier and an upper clipping threshold that is also decayed step-wise. With a privacy budget ($ε$) of 1, DP-SGD-Global-Adapt-V2-S improves accuracy by 0.9795%, 0.6786%, and 4.0130% on MNIST, CIFAR-10, and CIFAR-100, respectively. It also reduces the privacy cost gap ($π$) by 89.8332% and 60.5541% on the unbalanced MNIST and Thinwall datasets, respectively. Finally, we develop mathematical expressions to compute the privacy budget using truncated concentrated differential privacy (tCDP) for DP-SGD-Global-Adapt-V2-T and DP-SGD-Global-Adapt-V2-S.
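To make the step-decay idea concrete, the following is a minimal Python sketch of a generic step-decay schedule applied to both a noise multiplier and an upper clipping threshold. It uses the common textbook form (hold the value for a fixed number of epochs, then multiply it by a constant factor); the paper's exact formula may differ. The function name `step_decay`, its parameters, and all numeric values are assumptions chosen for illustration, not taken from the paper.

```python
def step_decay(initial_value, drop_rate, epochs_per_drop, epoch):
    """Generic step decay (an assumed form, not the paper's exact rule).

    The value stays constant within each stage of `epochs_per_drop` epochs
    and is multiplied by `drop_rate` each time a new stage begins.
    """
    if not 0.0 < drop_rate <= 1.0:
        raise ValueError("drop_rate must be in (0, 1]")
    if epochs_per_drop < 1:
        raise ValueError("epochs_per_drop must be at least 1")
    stage = epoch // epochs_per_drop  # number of drops applied so far
    return initial_value * drop_rate ** stage


# Hypothetical settings: halve both quantities every 2 epochs.
num_epochs = 6
noise_schedule = [step_decay(1.0, 0.5, 2, e) for e in range(num_epochs)]
clip_schedule = [step_decay(4.0, 0.5, 2, e) for e in range(num_epochs)]

for epoch, (sigma, clip) in enumerate(zip(noise_schedule, clip_schedule)):
    print(f"epoch {epoch}: noise multiplier = {sigma}, upper clip = {clip}")
```

With these illustrative settings the noise multiplier follows 1.0, 1.0, 0.5, 0.5, 0.25, 0.25. Because a smaller noise multiplier spends more privacy budget per step, a schedule like this has to be paired with an accountant that tracks the cumulative loss across epochs, which is the role the tCDP expressions play in this work.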
