Analysis of Total Variation Minimization for Clustered Federated Learning
A. Jung
TL;DR
The paper addresses statistical heterogeneity in federated learning through clustered federated learning (CFL), using generalized total variation minimization (GTVMin) over a similarity graph. It formalizes CFL over a network, introduces a clustering assumption that allows for cluster-specific optima, and defines a similarity graph with connectivity measures to guide regularization. The main contribution is a rigorous upper bound on the cluster-wise deviation of GTVMin solutions from their cluster averages. The bound is expressed in terms of the cluster boundary, the second Laplacian eigenvalue $\lambda_2({\bf L}^{(\mathcal{C})})$, the cluster loss bound $\varepsilon^{(\mathcal{C})}$, and an outside-cluster norm bound $R$, and it highlights the role of the graph structure and the regularization strength $\alpha$. The bound gives insight into the robustness and effectiveness of GTVMin under statistical heterogeneity, and it informs the choice of similarity graph and regularization parameter in distributed optimization settings.
Abstract
A key challenge in federated learning applications is the statistical heterogeneity of local datasets. Clustered federated learning addresses this challenge by identifying clusters of local datasets that are approximately homogeneous. One recent approach to clustered federated learning is generalized total variation minimization (GTVMin). This approach requires a similarity graph, which can be obtained from domain expertise or in a data-driven fashion via graph learning techniques. Under a widely applicable clustering assumption, we derive an upper bound on the deviation between GTVMin solutions and their cluster-wise averages. This bound provides valuable insights into the effectiveness and robustness of GTVMin in addressing statistical heterogeneity within federated learning environments.
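
To make the quantity bounded in the abstract concrete, the following minimal sketch runs a toy version of GTVMin and measures how far the learned parameters deviate from their cluster-wise averages. Everything specific in it is an assumption made for illustration and is not taken from the paper: scalar local models, squared-error local losses, a squared-difference penalty across edges, plain gradient descent, and the hypothetical names `gtvmin`, `cluster_average`, and `cluster_deviation`. The toy graph has two well-connected clusters joined by a single weak boundary edge.

```python
# Toy GTVMin sketch with scalar local models (illustrative assumptions only).

def gtvmin(targets, edges, alpha, lr=0.05, steps=2000):
    """Gradient descent on
        sum_i (w_i - y_i)^2 + alpha * sum_{(i, j, a)} a * (w_i - w_j)^2.
    The squared-error losses and squared-difference penalty are assumed here."""
    w = [0.0] * len(targets)
    for _ in range(steps):
        # Gradient of the local losses.
        grad = [2.0 * (wi - yi) for wi, yi in zip(w, targets)]
        # Gradient of the variation penalty, edge by edge.
        for i, j, a in edges:
            pull = 2.0 * alpha * a * (w[i] - w[j])
            grad[i] += pull
            grad[j] -= pull
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def cluster_average(w, cluster):
    """Average of the local parameters over the nodes in one cluster."""
    return sum(w[i] for i in cluster) / len(cluster)


def cluster_deviation(w, cluster):
    """Sum of squared deviations from the cluster average."""
    avg = cluster_average(w, cluster)
    return sum((w[i] - avg) ** 2 for i in cluster)


# Hypothetical data: two clusters with different underlying values.
targets = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
cluster_a, cluster_b = [0, 1, 2], [3, 4, 5]

# Edges are (node, node, weight): dense inside clusters, one weak boundary edge.
edges = [
    (0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
    (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
    (2, 3, 0.01),
]

w_local = gtvmin(targets, edges, alpha=0.0)  # no coupling: purely local fits
w_hat = gtvmin(targets, edges, alpha=1.0)    # coupled via the similarity graph

for name, cluster in (("A", cluster_a), ("B", cluster_b)):
    print(name, cluster_deviation(w_local, cluster), cluster_deviation(w_hat, cluster))
```

In this toy setting, coupling the nodes with $\alpha = 1$ pulls the parameters within each cluster toward their cluster average while the two cluster averages stay well separated, because the boundary edge is weak relative to the connectivity inside each cluster. Increasing the boundary weight or lowering the intra-cluster weights is an easy way to explore how the graph structure affects the deviation.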
