FedVeca: Federated Vectorized Averaging on Non-IID Data with Adaptive Bi-directional Global Objective

Ping Luo, Jieren Cheng, Zhenhao Liu, N. Xiong, Jie Wu

TL;DR

This work tackles Federated Learning with Non-IID data and heterogeneous client workloads by introducing FedVeca, which defines local gradients as bi-directional vectors with adaptive step sizes $\tau_{(k,i)}$. The authors derive upper bounds on per-round updates and design a server–client algorithm that adaptively adjusts $\tau_{(k,i)}$ to steer the global model toward an optimum, supported by convergence analysis using $A_{(k,i)}=\eta \beta_{(k,i)}^2 \delta_{(k,i)}$ and $\alpha_k$. Empirical results on MNIST and CIFAR-10 with a five-node prototype show FedVeca achieving faster convergence and competitive final accuracy compared to FedAvg, FedNova, FedProx, and SCAFFOLD, especially under Non-IID distributions. The approach advances efficient FL on heterogeneous devices and provides a principled framework for adaptive local iteration control in distributed learning settings.

Abstract

Federated Learning (FL) is a distributed machine learning framework that alleviates data silos: decentralized clients collaboratively learn a global model without sharing their private data. However, the clients' Non-Independent and Identically Distributed (Non-IID) data negatively affect the trained model, and clients with different numbers of local updates may cause significant gaps between the local gradients in each communication round. In this paper, we propose a Federated Vectorized Averaging (FedVeca) method to address these problems on Non-IID data. Specifically, we set a novel objective for the global model that is related to the local gradients. The local gradient is defined as a bi-directional vector with a step size and a direction, where the step size is the number of local updates and the direction is divided into positive and negative according to our definition. In FedVeca, the direction is influenced by the step size, so we average the bi-directional vectors to reduce the effect of different step sizes. We then theoretically analyze the relationship between the step sizes and the global objective, and obtain upper bounds on the step sizes per communication round. Based on these upper bounds, we design an algorithm for the server and the clients that adaptively adjusts the step sizes to bring the objective close to the optimum. Finally, we build a prototype system and conduct experiments on different datasets, models, and scenarios; the experimental results demonstrate the effectiveness and efficiency of the FedVeca method.
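
The effect of unequal local step counts on plain averaging can be illustrated with a small sketch. This is not FedVeca's update rule, which the abstract does not spell out; it is the generic idea of dividing each client's accumulated change by its own step count before averaging (as in normalized-averaging methods such as FedNova). The quadratic client objectives, the function names `local_update` and `run_rounds`, and all parameter values are assumptions made for illustration only.

```python
from typing import Callable, List, Sequence

Vector = List[float]
GradFn = Callable[[Vector], Vector]


def local_update(w: Vector, grad_fn: GradFn, tau: int, eta: float) -> Vector:
    """Run tau local gradient steps from w and return the accumulated change."""
    w_local = list(w)
    for _ in range(tau):
        g = grad_fn(w_local)
        w_local = [wj - eta * gj for wj, gj in zip(w_local, g)]
    return [wl - wg for wl, wg in zip(w_local, w)]


def run_rounds(
    centers: Sequence[Vector],
    taus: Sequence[int],
    eta: float = 0.1,
    rounds: int = 200,
    normalize: bool = True,
) -> Vector:
    """Simulate communication rounds with heterogeneous local step counts.

    Hypothetical client objective: f_i(w) = 0.5 * ||w - c_i||^2, whose
    gradient is w - c_i. The unweighted global optimum is the mean of c_i.
    """
    grad_fns = [
        (lambda w, c=c: [wj - cj for wj, cj in zip(w, c)]) for c in centers
    ]
    n = len(centers)
    tau_mean = sum(taus) / n
    w = [0.0] * len(centers[0])
    for _ in range(rounds):
        deltas = [local_update(w, g, t, eta) for g, t in zip(grad_fns, taus)]
        if normalize:
            # Convert each client's change to a per-step vector, then rescale
            # by the mean step count so the overall step length is comparable.
            deltas = [
                [tau_mean * d / t for d in delta]
                for delta, t in zip(deltas, taus)
            ]
        avg = [sum(delta[j] for delta in deltas) / n for j in range(len(w))]
        w = [wj + aj for wj, aj in zip(w, avg)]
    return w


if __name__ == "__main__":
    centers = [[-1.0], [1.0]]  # two clients with opposing local optima
    taus = [1, 5]              # the second client runs five times more steps
    print("plain averaging:     ", run_rounds(centers, taus, normalize=False))
    print("normalized averaging:", run_rounds(centers, taus, normalize=True))
```

In this toy setting the unweighted optimum is 0. Plain averaging is pulled toward the client that runs more local steps, while the normalized variant settles much closer to 0, which is the kind of step-size effect the abstract says averaging the vectors is meant to reduce.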

Paper Structure

This paper contains 33 sections, 2 theorems, 33 equations, 12 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

In the $k$-th round, for $k \in [0, K-1]$, when $\eta \tau_{k} L - 1 \geq 0$, we have a bound (the inequality itself is missing from this listing) in which $A_{(k,i)} \triangleq \eta \beta_{(k,i)}^{2} \delta_{(k,i)}$ is a variable that varies with $\delta_{(k,i)}$ and $\beta_{(k,i)}^{2}$.

Figures (12)

  • Figure 1: A typical FL communication network system.
  • Figure 2: The overall architecture of the proposed Federated Vectorized Averaging (FedVeca) method on Non-IID data.
  • Figure 3: Generalized update rules in the $k$-th round.
  • Figure 4: Illustrative diagram of the learning problem in the $k$-th round.
  • Figure 5: Loss function values and classification accuracy values in Case 3.
  • ...and 7 more figures

Theorems & Definitions (4)

  • Theorem 1
  • proof
  • Theorem 2
  • proof