Learning with User-Level Local Differential Privacy
Puning Zhao, Li Shen, Rongfei Fan, Qingming Li, Huiwen Wu, Jiafei Wu, Zhe Liu
TL;DR
The paper studies learning under user-level local differential privacy (LDP), where a user's entire dataset is protected rather than individual items. It shows that, in the local model, user-level protection can yield mean-estimation and learning rates comparable to, or even better than, those under item-level LDP. It introduces adaptive, ε-aware mechanisms (including two-stage mean estimators and Kashin/Hadamard-based transforms) and applies them to stochastic optimization, nonparametric classification, and regression, together with information-theoretic lower bounds that establish minimax optimality up to logarithmic factors. A key finding is that, unlike in the central model, where user-level DP always slows convergence, user-level and item-level LDP achieve nearly the same convergence rates for distributions with bounded support, and the user-level rate is even faster for heavy-tailed distributions. Collectively, the results provide practical, theoretically grounded tools for federated-like learning under user-level privacy constraints and highlight rich phase transitions driven by the privacy parameter ε.
Abstract
User-level privacy is important in distributed systems. Previous research has focused primarily on the central model, while the local model has received much less attention. Under the central model, user-level DP is strictly stronger than item-level DP. Under the local model, however, the relationship between user-level and item-level LDP is more complex, so the analysis differs substantially. In this paper, we first analyze the mean estimation problem and then apply the results to stochastic optimization, classification, and regression. In particular, we propose adaptive strategies that achieve optimal performance at all privacy levels. We also derive information-theoretic lower bounds, which show that the proposed methods are minimax optimal up to logarithmic factors. Unlike the central DP model, where user-level DP always leads to slower convergence, our results show that under the local model the convergence rates of the user-level and item-level cases are nearly the same for distributions with bounded support. For heavy-tailed distributions, the user-level rate is even faster than the item-level one.
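
To make the setting concrete, the following is a minimal sketch of mean estimation under user-level LDP for scalar data with bounded support. It is not the paper's adaptive two-stage estimator: it is a plain one-stage Laplace mechanism, and the function names, the bound of 1.0, and the shared random generator used for simulation are all assumptions made for illustration. Each user averages their own samples locally and releases a single noisy value, so the privacy guarantee covers the user's entire dataset.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_user(samples, epsilon, rng, bound=1.0):
    """Release one noisy value per user (hypothetical helper).

    The user clips each sample to [-bound, bound], averages locally,
    and adds Laplace noise before anything leaves the device.
    """
    clipped = [max(-bound, min(bound, x)) for x in samples]
    local_mean = sum(clipped) / len(clipped)
    # Replacing the user's whole dataset changes local_mean by at most
    # 2 * bound, so this noise scale gives epsilon-LDP at the user level.
    return local_mean + laplace_noise(2.0 * bound / epsilon, rng)


def estimate_mean(users, epsilon, seed=0):
    """Server-side estimate: average of the users' noisy reports.

    A single seeded generator stands in for each user's own randomness;
    this is a simulation convenience only.
    """
    rng = random.Random(seed)
    reports = [privatize_user(samples, epsilon, rng) for samples in users]
    return sum(reports) / len(reports)


if __name__ == "__main__":
    data_rng = random.Random(1)
    # 20000 simulated users with 10 samples each, true mean 0.3.
    users = [[data_rng.uniform(-0.2, 0.8) for _ in range(10)]
             for _ in range(20000)]
    print(estimate_mean(users, epsilon=1.0))
```

In this sketch the noise scale does not shrink with the number of samples per user, because the clipping range is fixed in advance. A two-stage approach such as the one the abstract alludes to would presumably first locate the mean coarsely and then clip to a narrower interval around it, which is where the gain from having many samples per user would come from.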
