Smoothing ADMM for Non-convex and Non-smooth Hierarchical Federated Learning
Reza Mirzaeifard, Stefan Werner
TL;DR
The paper addresses optimization in hierarchical federated learning with non-convex and non-smooth objectives under client and cluster heterogeneity. It introduces Hierarchical Federated Smoothing ADMM (HFSAD), which combines smooth upper-bound approximations with ADMM and auxiliary variables to enable asynchronous, multi-update distributed optimization across clusters and a global root. Regularizers are decomposed per cluster, supporting consensus via the total variation norm and personalization via non-convex penalties such as SCAD and MCP, while keeping a proximal-friendly structure for efficient updates. Empirical results on SCAD-penalized robust phase retrieval show that HFSAD converges faster and reaches higher accuracy than centralized baselines, indicating that it is robust and practical for large-scale, heterogeneous FL scenarios.
Abstract
This paper presents a hierarchical federated learning (FL) framework that extends the alternating direction method of multipliers (ADMM) with smoothing techniques, tailored for non-convex and non-smooth objectives. Unlike traditional hierarchical FL methods, our approach supports asynchronous updates and multiple updates per iteration, enhancing adaptability to heterogeneous data and system settings. Additionally, we introduce a flexible mechanism to leverage diverse regularization functions at each layer, allowing customization to the specific prior information within each cluster and accommodating possibly non-smooth penalty objectives. Depending on the learning goal, the framework supports both consensus and personalization: the total variation norm can be used to enforce consensus across layers, while non-convex penalties such as the minimax concave penalty (MCP) or the smoothly clipped absolute deviation (SCAD) penalty enable personalized learning. Experimental results demonstrate faster convergence and higher accuracy for our method than for conventional approaches, underscoring its robustness and versatility for a wide range of FL scenarios.
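To make the two non-convex penalties named above concrete, the following is a minimal NumPy sketch of the standard textbook definitions of SCAD and MCP, applied elementwise. It is an illustration only and not the paper's implementation: the function names and the default values lam=1.0, a=3.7, and gamma=3.0 are assumptions chosen for the example.

```python
import numpy as np


def scad_penalty(x, lam=1.0, a=3.7):
    """Elementwise SCAD penalty (standard form, requires a > 2).

    lam and a are illustrative defaults, not values from the paper.
    """
    t = np.abs(np.asarray(x, dtype=float))
    # Region 1: behaves like the l1 penalty near zero.
    small = lam * t
    # Region 2: quadratic transition that flattens the slope.
    mid = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))
    # Region 3: constant, so large entries are not penalized further.
    large = lam**2 * (a + 1) / 2
    return np.where(t <= lam, small, np.where(t <= a * lam, mid, large))


def mcp_penalty(x, lam=1.0, gamma=3.0):
    """Elementwise MCP penalty (standard form, requires gamma > 1).

    lam and gamma are illustrative defaults, not values from the paper.
    """
    t = np.abs(np.asarray(x, dtype=float))
    # Concave quadratic up to gamma * lam, constant afterwards.
    inner = lam * t - t**2 / (2 * gamma)
    outer = gamma * lam**2 / 2
    return np.where(t <= gamma * lam, inner, outer)


if __name__ == "__main__":
    grid = np.array([0.0, 0.5, 1.0, 3.7, 10.0])
    print("SCAD:", scad_penalty(grid))
    print("MCP: ", mcp_penalty(grid))
```

Both penalties match the l1 penalty's slope at the origin and become constant for large magnitudes. That flat tail is what lets large deviations between models persist without extra cost, which is consistent with the personalization role described above.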
