On Theoretical Limits of Learning with Label Differential Privacy
Puning Zhao, Chuan Ma, Li Shen, Shaowei Wang, Rongfei Fan
TL;DR
The paper asks how much accuracy is attainable when learning with label differential privacy (DP). By formulating minimax lower bounds via multiple hypothesis testing and constructing algorithms with matching upper bounds, it shows that relaxing DP to protect only labels yields substantial gains in the local model but only constant-factor gains in the central model. Across classification and regression under both bounded and heavy-tailed noise, it derives precise rates that depend on the Hölder smoothness $\beta$, the Tsybakov margin parameter $\gamma$, the dimension $d$, and the privacy budget $\epsilon$, making the divergence between the local and central settings explicit. The results have practical implications for privacy-preserving learning: they indicate when sacrificing feature privacy is worth the accuracy gain and how to design efficient label DP mechanisms.
Abstract
Label differential privacy (DP) is designed for learning problems involving private labels and public features. While various methods have been proposed for learning under label DP, the theoretical limits remain largely unexplored. In this paper, we investigate the fundamental limits of learning with label DP in both the local and central models, for both classification and regression tasks, characterized by minimax convergence rates. We establish lower bounds by converting each task into a multiple hypothesis testing problem and bounding the test error. Additionally, we develop algorithms that yield matching upper bounds. Our results demonstrate that under label local DP (LDP), the risk converges significantly faster than under full LDP, i.e., protecting both features and labels, indicating the advantage of relaxing the DP definition to focus solely on labels. In contrast, under label central DP (CDP), the risk is reduced only by a constant factor compared to full CDP, indicating that relaxing CDP to protect only labels offers limited performance benefits.
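To make the label local DP setting concrete, the following is a minimal sketch of k-ary randomized response, a standard mechanism that satisfies $\epsilon$-label LDP for classification. It is offered as an illustration only and is not claimed to be the algorithm analyzed in the paper; the function names `randomized_response` and `debiased_class_frequencies` and all parameter names are hypothetical. Features are left untouched, and only the label is perturbed.

```python
import math
import random
from collections import Counter


def randomized_response(label, num_classes, epsilon, rng=random):
    """Perturb one label with k-ary randomized response (illustrative sketch).

    The true label is kept with probability e^eps / (e^eps + k - 1);
    otherwise one of the other k - 1 labels is returned uniformly at random.
    The ratio between any two output probabilities is at most e^eps.
    """
    keep_prob = math.exp(epsilon) / (math.exp(epsilon) + num_classes - 1)
    if rng.random() < keep_prob:
        return label
    # Draw uniformly from the k - 1 labels different from the true one.
    other = rng.randrange(num_classes - 1)
    return other if other < label else other + 1


def debiased_class_frequencies(noisy_labels, num_classes, epsilon):
    """Unbiased estimate of the true class frequencies from noisy labels.

    Uses E[noisy frequency of c] = q + (p - q) * (true frequency of c),
    where p is the keep probability and q the probability of each other label.
    """
    n = len(noisy_labels)
    e = math.exp(epsilon)
    p = e / (e + num_classes - 1)
    q = 1.0 / (e + num_classes - 1)
    counts = Counter(noisy_labels)
    return [(counts.get(c, 0) / n - q) / (p - q) for c in range(num_classes)]
```

Under these assumptions, each user would call `randomized_response` on their own label before sharing it, and the learner would correct for the known noise, as `debiased_class_frequencies` does for marginal class frequencies. The debiasing divides by $p - q = (e^{\epsilon} - 1)/(e^{\epsilon} + k - 1)$, so the estimation variance grows as $\epsilon$ shrinks.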
