Enhancing DPSGD via Per-Sample Momentum and Low-Pass Filtering
Xincheng Xu, Thilina Ranbaduge, Qing Wang, Thierry Rakotoarivelo, David Smith
TL;DR
This paper addresses the accuracy degradation of differentially private training with DPSGD by jointly reducing DP noise and clipping bias. It introduces DP-PMLF, which combines per-sample momentum for variance reduction with a post-processing linear low-pass filter to suppress high-frequency DP noise without extra privacy cost. Theoretical results establish improved convergence rates under DP and show that the low-pass filter and momentum terms reduce clipping bias and noise effects, respectively. Empirical evaluations on image and language tasks demonstrate a consistent privacy-utility improvement over state-of-the-art DPSGD variants, highlighting DP-PMLF's practical impact for private deep learning.
Abstract
Differentially Private Stochastic Gradient Descent (DPSGD) is widely used to train deep neural networks with formal privacy guarantees. However, the addition of differential privacy (DP) often degrades model accuracy by introducing both noise and bias. Existing techniques typically address only one of these issues, as reducing DP noise can exacerbate clipping bias and vice versa. In this paper, we propose a novel method, DP-PMLF, which integrates per-sample momentum with a low-pass filtering strategy to simultaneously mitigate DP noise and clipping bias. Our approach uses per-sample momentum to smooth gradient estimates prior to clipping, thereby reducing sampling variance. It further employs a post-processing low-pass filter to attenuate high-frequency DP noise without consuming additional privacy budget. We provide a theoretical analysis demonstrating an improved convergence rate under rigorous DP guarantees, and our empirical evaluations reveal that DP-PMLF significantly enhances the privacy-utility trade-off compared to several state-of-the-art DPSGD variants.
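
The order of operations described above (per-sample momentum, then clipping, then noise, then a post-processing low-pass filter) can be illustrated with a minimal NumPy sketch on a toy least-squares problem. This is not the paper's algorithm as specified: the abstract does not give the update rules, so the exponential-moving-average form of the per-sample momentum, the first-order filter, the full-batch setting, and all names and defaults (`dp_pmlf_sketch`, `clip_rows`, `beta`, `alpha`, `sigma`, `clip`) are assumptions made for illustration. The sketch performs no privacy accounting and should not be read as providing a DP guarantee.

```python
import numpy as np


def clip_rows(v, c):
    """Scale each row of v so that its L2 norm is at most c."""
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    return v * np.minimum(1.0, c / np.maximum(norms, 1e-12))


def dp_pmlf_sketch(X, y, steps=200, lr=0.1, clip=1.0, sigma=1.0,
                   beta=0.9, alpha=0.3, seed=0):
    """Toy full-batch loop: momentum -> clip -> noise -> low-pass filter.

    All hyperparameters and update forms are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    m = np.zeros((n, d))   # one momentum vector per training sample
    g_filt = np.zeros(d)   # state of the low-pass filter

    for _ in range(steps):
        # Per-sample gradients of 0.5 * (x_i . w - y_i)^2, shape (n, d).
        residual = X @ w - y
        grads = residual[:, None] * X

        # Assumed per-sample momentum: an exponential moving average of
        # each sample's gradient, applied before clipping.
        m = beta * m + (1.0 - beta) * grads

        # Clip each sample's momentum vector, sum, add Gaussian noise
        # with standard deviation sigma * clip, then average over n.
        clipped = clip_rows(m, clip)
        noise = rng.normal(0.0, sigma * clip, size=d)
        noisy = (clipped.sum(axis=0) + noise) / n

        # Assumed first-order low-pass filter on the already-noised
        # gradient; it reads only the noisy output, not the raw data.
        g_filt = (1.0 - alpha) * g_filt + alpha * noisy

        w = w - lr * g_filt
    return w
```

In this sketch the filter consumes only the noised aggregate, which mirrors the abstract's point that the low-pass step is post-processing and therefore uses no additional privacy budget. Whether the momentum step preserves the intended sensitivity bound depends on the paper's analysis, which this sketch does not reproduce.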
