DP-FedPGN: Finding Global Flat Minima for Differentially Private Federated Learning via Penalizing Gradient Norm

Junkang Liu; Yuxuan Tian; Fanhua Shang; Yuanyuan Liu; Hongying Liu; Junchao Zhou; Daorui Ding

TL;DR

The paper tackles privacy-induced degradation in client-level differentially private federated learning (CL-DPFL), where gradient clipping and DP noise produce sharp loss landscapes and poor generalization. It introduces DP-FedPGN, which adds a global gradient-norm penalty to steer optimization toward global flat minima, and an optional variant, DP-FedPGN-LS, which applies Laplacian smoothing to flatten the landscape further. The authors provide convergence, sensitivity, and privacy analyses under Rényi DP, and report substantial empirical gains with CNNs, Vision Transformers, and RoBERTa on six tasks under non-IID data, along with faster convergence and improved privacy-utility trade-offs. The approach is practical for large-scale, heterogeneous data in DPFL and offers a principled way to mitigate DP-related degradation while preserving performance.

Abstract

To prevent inference attacks in Federated Learning (FL) and reduce the leakage of sensitive information, Client-level Differentially Private Federated Learning (CL-DPFL) is widely used. However, current CL-DPFL methods usually produce sharper loss landscapes, which degrades model generalization once differential privacy protection is applied. Popular federated learning methods currently alleviate this problem by using Sharpness-Aware Minimization (SAM) to find a local flat minimum. However, local flatness may not reflect global flatness in CL-DPFL. To address this issue and seek global flat minima, we propose a new CL-DPFL algorithm, DP-FedPGN, which adds a global gradient norm penalty to the local loss. This penalty not only finds a flatter global minimum but also reduces the norm of the local updates, which in turn reduces the error introduced by gradient clipping. From a theoretical perspective, we analyze how DP-FedPGN mitigates the performance degradation caused by DP. Meanwhile, the proposed DP-FedPGN algorithm eliminates the impact of data heterogeneity and achieves fast convergence. We also use Rényi DP to provide strict privacy guarantees and provide a sensitivity analysis for local updates. Finally, we evaluate the method on both ResNet and Transformer models and achieve significant improvements on six visual and natural language processing tasks compared to existing state-of-the-art algorithms. The code is available at https://github.com/junkangLiu0/DP-FedPGN
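
The abstract combines two ingredients: a gradient-norm penalty added to the loss, and the clip-then-add-noise step of client-level DP. The NumPy sketch below illustrates generic versions of both and is not the authors' implementation. It assumes a penalty of the form L(w) + lam * ||grad L(w)|| evaluated on a single loss, whereas the paper penalizes the global gradient norm, whose exact form is not given in this section. The function names (`penalized_gradient`, `clip_update`, `dp_aggregate`) and the parameters `lam`, `r`, `clip_norm`, and `noise_multiplier` are hypothetical.

```python
import numpy as np

def penalized_gradient(grad_fn, w, lam, r=1e-3):
    """Approximate the gradient of L(w) + lam * ||grad L(w)||.

    The penalty's gradient is H(w) @ g / ||g||, with g = grad L(w).
    The Hessian-vector product is estimated by a finite difference
    of gradients, so no second derivatives are needed.
    """
    g = grad_fn(w)
    v = g / (np.linalg.norm(g) + 1e-12)        # unit direction of the gradient
    hv = (grad_fn(w + r * v) - g) / r          # H @ v, finite-difference estimate
    return g + lam * hv

def clip_update(delta, clip_norm):
    """Scale a client update so its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(delta)
    return delta * min(1.0, clip_norm / (norm + 1e-12))

def dp_aggregate(updates, clip_norm, noise_multiplier, rng):
    """Average clipped client updates and add Gaussian noise (assumed scaling)."""
    clipped = [clip_update(u, clip_norm) for u in updates]
    mean = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(updates)
    return mean + rng.normal(0.0, sigma, size=mean.shape)

# Toy quadratic loss L(w) = 0.5 * w^T A w, so grad L(w) = A @ w.
A = np.diag([1.0, 10.0])
grad_fn = lambda w: A @ w
w0 = np.array([1.0, 1.0])
g_pen = penalized_gradient(grad_fn, w0, lam=0.1)
```

In this toy case the penalty adds the most weight along the high-curvature coordinate, which is the mechanism by which a gradient-norm penalty discourages sharp regions. A smaller update norm also means `clip_update` rescales less, which matches the abstract's remark about reduced clipping error.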
