Convergent Privacy Loss of Noisy-SGD without Convexity and Smoothness
Eli Chien, Pan Li
TL;DR
This paper proves that hidden-state Noisy-SGD on a bounded domain enjoys a non-trivial convergent privacy loss even without convexity or smoothness, as long as the gradient is Hölder continuous. It achieves this by extending the shifted Rényi divergence framework with forward Wasserstein distance tracking and a Hölder reduction lemma, which yields a convergent RDP bound for non-convex non-smooth losses and a tighter RDP bound than prior work for smooth strongly convex losses. The results cover full-batch and mini-batch regimes, including subsampling and shuffled minibatching, and provide practical guidance via optimizable shift allocations. The work advances privacy accounting for DP-SGD, offering broader applicability to deep learning settings while highlighting open problems and future directions for tighter bounds and practical implementations.
Abstract
We study the Differential Privacy (DP) guarantee of hidden-state Noisy-SGD algorithms over a bounded domain. Standard privacy analysis for Noisy-SGD assumes all internal states are revealed, which leads to a Rényi DP bound that diverges with the number of iterations. Ye & Shokri (2022) and Altschuler & Talwar (2022) proved convergent bounds for smooth (strongly) convex losses and raised open questions about whether these assumptions can be relaxed. We provide positive answers by proving a convergent Rényi DP bound for non-convex non-smooth losses, where we show that requiring losses to have Hölder continuous gradients is sufficient. We also provide a strictly better privacy bound than state-of-the-art results for smooth strongly convex losses. Our analysis relies on improving the shifted divergence analysis in multiple aspects, including forward Wasserstein distance tracking, identifying the optimal shift allocation, and the Hölder reduction lemma. Our results further elucidate the benefit of hidden-state analysis for DP and its applicability.
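To make the setting concrete, the following is a minimal sketch of hidden-state Noisy-SGD over a bounded domain, under stated assumptions rather than the paper's exact algorithm. The function names (`project_l2_ball`, `noisy_sgd_last_iterate`), the choice of an L2 ball as the bounded domain, subsampling without replacement, and the noise scale `sigma` added directly to each step are all illustrative assumptions. The only point it is meant to show is that intermediate iterates stay internal and only the final one is released.

```python
import numpy as np


def project_l2_ball(x, radius):
    """Project x onto the L2 ball of the given radius (assumed bounded domain)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def noisy_sgd_last_iterate(grad_fn, data, x0, steps, lr, sigma, radius,
                           batch_size, rng):
    """Run projected Noisy-SGD and return only the last iterate.

    grad_fn(x, z) is a hypothetical per-example gradient; a privacy analysis
    would additionally need its sensitivity to be bounded (for example by
    clipping), which this sketch does not enforce.
    """
    x = project_l2_ball(np.asarray(x0, dtype=float), radius)
    n = len(data)
    for _ in range(steps):
        # Subsample a mini-batch without replacement (one of several regimes).
        idx = rng.choice(n, size=batch_size, replace=False)
        g = np.mean([grad_fn(x, data[i]) for i in idx], axis=0)
        # Gaussian noise added to the update; the scaling is an assumption.
        noise = rng.normal(0.0, sigma, size=x.shape)
        # Projection keeps every iterate inside the bounded domain.
        x = project_l2_ball(x - lr * g + noise, radius)
    # Hidden state: intermediate iterates are never returned.
    return x
```

A standard analysis would account for every iterate in the loop as if it were published, whereas the hidden-state view only charges for the returned `x`.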
