Federated Smoothing Proximal Gradient for Quantile Regression with Non-Convex Penalties
Reza Mirzaeifard, Diyako Ghaderyan, Stefan Werner
TL;DR
This work tackles federated quantile regression on decentralized IoT data under privacy constraints, where the objective combines a non-smooth check loss with non-convex penalties (MCP/SCAD). It introduces the Federated Smoothing Proximal Gradient (FSPG) method, which replaces the non-smooth components with smooth surrogates and couples local gradient steps with a central proximal update, guided by a time-varying penalty and smoothing parameter that shrink over iterations. The authors prove convergence to a stationary point by establishing a descent property, subgradient bounds, and rates such as $\|w^{(k+1)}-w^{(k)}\|_2^2 = o(k^{-1-d})$ and $\|\kappa^{(k+1)}\|_2 = o(k^{-1/2+d/2})$. They also show that, as the smoothing parameter $\mu \to 0$, the gradients of the smoothed function $\tilde{g}$ converge to subgradients of the original non-smooth function. Experiments on synthetic and real datasets show that FSPG converges faster and recovers sparse models more accurately than competing federated methods, and that its performance remains robust under varying sparsity levels and data distributions.
Abstract
Distributed sensors in the internet-of-things (IoT) generate vast amounts of sparse data. Analyzing this high-dimensional data and identifying relevant predictors pose substantial challenges, especially when the data should remain on the device where it was collected for reasons such as data integrity, communication bandwidth, and privacy. This paper introduces a federated quantile regression algorithm to address these challenges. Quantile regression provides a more comprehensive view of the relationship between variables than mean regression models. However, traditional approaches struggle with nonconvex sparse penalties and the inherent non-smoothness of the loss function. To overcome these difficulties, we propose a federated smoothing proximal gradient (FSPG) algorithm that integrates a smoothing mechanism with the proximal gradient framework, thereby enhancing both precision and computational speed. This integration handles optimization over a network of devices, each holding local data samples, which makes it particularly effective in federated learning scenarios. The FSPG algorithm ensures steady progress and reliable convergence by maintaining or reducing the value of the objective function in each iteration. By leveraging nonconvex penalties, such as the minimax concave penalty (MCP) and smoothly clipped absolute deviation (SCAD), the proposed method can identify and preserve key predictors within sparse models. Comprehensive simulations validate the theoretical foundations of the proposed algorithm and demonstrate improved estimation precision and reliable convergence.
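To make the interplay of smoothing, local gradients, and the central proximal step concrete, the following Python sketch shows one plausible way such a loop could be organized. It is an illustration under stated assumptions, not the authors' implementation: the smooth surrogate used here is a Nesterov-type smoothing of the check loss (its derivative is a clipped linear function), the schedules `mu0 / sqrt(k + 1)` and `eta0 * mu` are placeholder choices, and the function names (`smoothed_check_grad`, `mcp_prox`, `local_gradient`, `fspg_sketch`) are hypothetical. Only the MCP case is shown.

```python
import numpy as np

def smoothed_check_grad(r, tau, mu):
    """Derivative of a Nesterov-smoothed check loss at residuals r.

    Assumed surrogate: max over z in [tau - 1, tau] of z*r - mu*z**2/2,
    whose derivative is r/mu clipped to [tau - 1, tau].
    """
    return np.clip(r / mu, tau - 1.0, tau)

def mcp_prox(z, lam, gamma, eta):
    """Proximal operator of eta * MCP(.; lam, gamma), valid for gamma > eta."""
    if gamma <= eta:
        raise ValueError("gamma must exceed the step size eta")
    a = np.abs(z)
    # Middle region: soft-threshold, then rescale to undo the concave part.
    shrunk = np.sign(z) * (a - eta * lam) / (1.0 - eta / gamma)
    return np.where(a <= eta * lam, 0.0, np.where(a <= gamma * lam, shrunk, z))

def local_gradient(X, y, w, tau, mu):
    """Gradient of the smoothed quantile loss on one client's data."""
    r = y - X @ w
    return -X.T @ smoothed_check_grad(r, tau, mu) / len(y)

def fspg_sketch(clients, tau=0.5, lam=0.1, gamma=3.0,
                mu0=1.0, eta0=0.5, rounds=200):
    """Simplified loop: clients send gradients, the server applies the prox."""
    w = np.zeros(clients[0][0].shape[1])
    for k in range(rounds):
        mu = mu0 / np.sqrt(k + 1.0)   # assumed decaying smoothing parameter
        eta = eta0 * mu               # assumed step size tied to mu
        grads = [local_gradient(X, y, w, tau, mu) for X, y in clients]
        w = mcp_prox(w - eta * np.mean(grads, axis=0), lam, gamma, eta)
    return w
```

In this sketch only gradients leave the clients, so raw samples stay on the device. Tying the step size to `mu` reflects that the smoothed loss has a gradient Lipschitz constant proportional to `1/mu`; the constants here are untuned and would need adjusting for real data.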
