Roughness-Informed Federated Learning

Mohammad Partohaghighi; Roummel Marcia; Bruce J. West; YangQuan Chen

TL;DR

The paper tackles client drift in non-IID federated learning by introducing RI-FedAvg, which adds an adaptive regularization term based on a Roughness Index (RI) to each client's local objective, stabilizing updates in rough loss landscapes. The RI is computed from projections of the loss along random directions using normalized total variation, yielding a per-client regularization strength that scales with landscape complexity. The authors provide a convergence analysis for non-convex objectives and report that RI-FedAvg outperforms FedAvg, FedProx, FedDyn, SCAFFOLD, and DP-FedAvg on MNIST, CIFAR-10, and CIFAR-100, reaching higher accuracy and converging faster in heterogeneous settings. The work links loss landscape analysis to federated optimization and offers a practical approach to robust FL in data-heterogeneous environments.
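The RI computation described above can be illustrated with a minimal sketch. It is not the authors' definition: it assumes the index is the total variation of the loss sampled along a random unit direction, divided by the range of the sampled values, averaged over directions. The function name `roughness_index` and the parameters `num_directions`, `num_points`, and `radius` are hypothetical.

```python
import numpy as np

def roughness_index(loss_fn, w, num_directions=8, num_points=21, radius=1.0, seed=0):
    """Illustrative roughness estimate (assumed form, not the paper's exact RI).

    For each random unit direction d, sample loss_fn on the segment
    w + t*d, t in [-radius, radius], and compute the total variation of the
    samples divided by their range. A monotone 1D profile scores about 1;
    oscillating profiles score higher.
    """
    rng = np.random.default_rng(seed)
    ts = np.linspace(-radius, radius, num_points)
    ratios = []
    for _ in range(num_directions):
        d = rng.standard_normal(w.shape)
        d /= np.linalg.norm(d)                     # unit-length random direction
        vals = np.array([loss_fn(w + t * d) for t in ts])
        total_variation = np.abs(np.diff(vals)).sum()
        value_range = vals.max() - vals.min()
        ratios.append(total_variation / (value_range + 1e-12))  # guard against a flat profile
    return float(np.mean(ratios))

# Toy losses (assumptions for illustration only).
def quadratic(w):
    return float(np.sum(w ** 2))

def oscillating(w):
    return float(np.sin(10.0 * np.linalg.norm(w)))

w0 = np.zeros(5)
ri_smooth = roughness_index(quadratic, w0)
ri_rough = roughness_index(oscillating, w0)
```

Under this assumed definition, a smooth bowl receives a lower score than an oscillating surface, so a client whose local loss fluctuates more would receive a larger regularization weight.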

Abstract

Federated Learning (FL) enables collaborative model training across distributed clients while preserving data privacy, yet it faces challenges in non-independent and identically distributed (non-IID) settings, where client drift impairs convergence. We propose RI-FedAvg, a novel FL algorithm that mitigates client drift by incorporating a Roughness Index (RI)-based regularization term into the local objective. The RI quantifies the roughness of high-dimensional loss functions, so the term adaptively penalizes updates according to the fluctuations of each local loss landscape, supporting robust optimization in heterogeneous settings. We provide a rigorous convergence analysis for non-convex objectives, establishing that RI-FedAvg converges to a stationary point under standard assumptions. Extensive experiments on MNIST, CIFAR-10, and CIFAR-100 demonstrate that RI-FedAvg outperforms state-of-the-art baselines, including FedAvg, FedProx, FedDyn, SCAFFOLD, and DP-FedAvg, achieving higher accuracy and faster convergence in non-IID scenarios. Our results highlight RI-FedAvg's potential to enhance the robustness and efficiency of federated learning in practical, heterogeneous environments.

Paper Structure (22 sections, 4 theorems, 58 equations, 9 figures, 6 tables, 1 algorithm)

Introduction
Motivation and Context
Challenges in Non-IID Federated Learning
Our Contributions
Paper Organization
Related Work
Federated Learning Algorithms
Loss Landscape Analysis
Positioning RI-FedAvg
Methodology
Preliminaries
RI-FedAvg Algorithm
Convergence Analysis
Definitions
Assumptions
...and 7 more sections

Key Result

Lemma 1

For client $k \in S_t$, performing one SGD step on $\tilde{F}_k(\mathbf{w})$ with learning rate $\eta \leq \frac{1}{L + 2 \lambda \mathcal{I}_k}$, starting from $\mathbf{w} = \mathbf{w}_{t,k}$, yields: where $\mathbf{w}' = \mathbf{w}_{t,k} - \eta g(\mathbf{w}_{t,k}; b)$, $g(\mathbf{w}_{t,k}; b) = \nabla \ell(\mathbf{w}_{t,k}; b) + 2 \lambda \mathcal{I}_k (\mathbf{w}_{t,k} - \mathbf{w}_t)$, and $L
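The update in Lemma 1 can be sketched in a few lines. The gradient $g = \nabla \ell + 2 \lambda \mathcal{I}_k (\mathbf{w}_{t,k} - \mathbf{w}_t)$ and the step-size bound $\eta \leq 1/(L + 2 \lambda \mathcal{I}_k)$ come from the lemma statement. The implied objective $\tilde{F}_k(\mathbf{w}) = F_k(\mathbf{w}) + \lambda \mathcal{I}_k \|\mathbf{w} - \mathbf{w}_t\|^2$ is inferred from that gradient rather than quoted from the paper. The function names and numeric values below are hypothetical.

```python
import numpy as np

def ri_local_step(w_local, w_global, grad_loss, lam, ri_k, lr):
    """One local SGD step with the RI-weighted proximal term from Lemma 1.

    grad_loss is the mini-batch gradient of the client's loss at w_local;
    ri_k is the client's roughness index (treated here as a given scalar).
    """
    g = grad_loss + 2.0 * lam * ri_k * (w_local - w_global)
    return w_local - lr * g

def max_step_size(smoothness, lam, ri_k):
    """Upper bound on the learning rate stated in Lemma 1: 1 / (L + 2*lam*I_k)."""
    return 1.0 / (smoothness + 2.0 * lam * ri_k)

# Hypothetical values for illustration only.
w_local = np.array([1.0, 2.0])
w_global = np.array([0.0, 0.0])
grad_loss = np.array([0.5, -0.5])
lam, ri_k = 0.1, 2.0

eta_max = max_step_size(smoothness=1.0, lam=lam, ri_k=ri_k)
w_next = ri_local_step(w_local, w_global, grad_loss, lam, ri_k, lr=0.1)
```

A larger `ri_k` strengthens the pull toward the global model and tightens the admissible step size, which is how a rougher local landscape leads to more conservative local updates in this sketch.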

Figures (9)

Figure 1: Test Accuracy Evolution over 50 Communication Rounds for Non-IID MNIST
Figure 2: Roughness Index Evolution over Communication Rounds on MNIST.
Figure 3: Test Accuracy of RI-FedAvg on Non-IID MNIST after 50 Rounds across Varying Parameters ($M$, $m$, $\lambda$, $\eta$).
Figure 4: Test Accuracy Evolution over 50 Communication Rounds on CIFAR-10.
Figure 5: Roughness Index Evolution over Communication Rounds on CIFAR-10.
...and 4 more figures

Theorems & Definitions (8)

Lemma 1: One-Step Local Update
proof
Lemma 2: Gradient Relation
proof
Lemma 3: Client Drift Bound
proof
Theorem 1: Convergence of RI-FedAvg
proof
