GeoClip: Geometry-Aware Clipping for Differentially Private SGD
Atefeh Gilani, Naima Tasnim, Lalitha Sankar, Oliver Kosut
TL;DR
GeoClip addresses the clipping-threshold problem in DP-SGD by clipping and perturbing gradients in a geometry-aware basis defined by a transformation $M_t$ and a shift $a_t$, so that noise is added along directions that preserve more utility. The authors provide a convergence guarantee and derive a closed-form solution for the optimal transformation, along with two practical estimators (a full-covariance moving average and a streaming rank-$k$ PCA) that reuse already-privatized gradients and therefore incur no extra privacy cost. Empirically, GeoClip outperforms AdaClip, quantile-based clipping, and standard DP-SGD on synthetic, tabular, and image tasks under the same privacy budget, including in transfer-learning fine-tuning; the low-rank PCA variant scales the method to high-dimensional settings. The reported benefits are faster convergence, reduced variance, and an improved privacy-utility trade-off.
Abstract
Differentially private stochastic gradient descent (DP-SGD) is the most widely used method for training machine learning models with provable privacy guarantees. A key challenge in DP-SGD is setting the per-sample gradient clipping threshold, which significantly affects the trade-off between privacy and utility. While recent adaptive methods improve performance by adjusting this threshold during training, they operate in the standard coordinate system and fail to account for correlations across the coordinates of the gradient. We propose GeoClip, a geometry-aware framework that clips and perturbs gradients in a transformed basis aligned with the geometry of the gradient distribution. GeoClip adaptively estimates this transformation using only previously released noisy gradients, incurring no additional privacy cost. We provide convergence guarantees for GeoClip and derive a closed-form solution for the optimal transformation that minimizes the amount of noise added while keeping the probability of gradient clipping under control. Experiments on both tabular and image datasets demonstrate that GeoClip consistently outperforms existing adaptive clipping methods under the same privacy budget.
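To make the idea of clipping and perturbing in a transformed basis concrete, the following is a minimal NumPy sketch of one privatized gradient computation. It is an illustration under stated assumptions, not the authors' implementation: the function name `geo_clip_step`, the specific form $\omega_i = M_t (g_i - a_t)$, the unit clipping norm, and the placement of the noise are all assumed here, and the closed-form choice of $M_t$ from the paper is not reproduced.

```python
import numpy as np

def geo_clip_step(per_sample_grads, M, a, sigma, rng):
    """Hypothetical sketch: clip and add noise in a transformed basis.

    per_sample_grads: array of shape (batch, d)
    M: invertible (d, d) transformation (assumed given)
    a: shift vector of shape (d,) (assumed given)
    sigma: noise multiplier for a unit clipping norm
    """
    # Shift and transform each per-sample gradient: w_i = M (g_i - a).
    w = (per_sample_grads - a) @ M.T
    # Clip each transformed gradient to L2 norm at most 1.
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    w_clipped = w / np.maximum(1.0, norms)
    # Add isotropic Gaussian noise to the clipped sum, then average.
    batch_size, d = w.shape
    noise = sigma * rng.standard_normal(d)
    noisy_mean = (w_clipped.sum(axis=0) + noise) / batch_size
    # Map back to the original coordinates: g = M^{-1} w + a.
    return np.linalg.solve(M, noisy_mean) + a

rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0], [0.3, 0.4]])
# With M = I and a = 0 the sketch reduces to standard clipping at norm 1.
out = geo_clip_step(grads, np.eye(2), np.zeros(2), sigma=0.0, rng=rng)
print(out)
```

With the identity transformation and zero shift, this reduces to ordinary DP-SGD clipping with threshold 1. A non-identity $M$ rescales and rotates the directions in which gradients are clipped and noise is added before mapping the result back to the original coordinates.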
