Enhanced Federated Optimization: Adaptive Unbiased Client Sampling with Reduced Variance

Dun Zeng; Zenglin Xu; Yu Pan; Xu Luo; Qifan Wang; Xiaoying Tang

TL;DR

This work presents K-Vib, the first adaptive client sampler to employ an independent sampling procedure. Empirical studies indicate that K-Vib doubles the speed compared to baseline algorithms, demonstrating significant potential in federated optimization.

Abstract

Federated Learning (FL) is a distributed learning paradigm that trains a global model across multiple devices without collecting local data. In FL, a server typically selects a subset of clients for each training round to optimize resource usage. Central to this process is unbiased client sampling, which ensures a representative selection of clients. Current methods primarily use a random sampling procedure which, despite its effectiveness, achieves suboptimal efficiency owing to the loose upper bound caused by the sampling variance. In this work, by adopting an independent sampling procedure, we propose a federated optimization framework focused on adaptive unbiased client sampling, improving the convergence rate via an online variance reduction strategy. In particular, we present the first adaptive client sampler, K-Vib, employing an independent sampling procedure. K-Vib achieves a linear speed-up on the regret bound $\tilde{\mathcal{O}}\big(N^{\frac{1}{3}}T^{\frac{2}{3}}/K^{\frac{4}{3}}\big)$ within a set communication budget $K$. Empirical studies indicate that K-Vib doubles the speed compared to baseline algorithms, demonstrating significant potential in federated optimization.
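As an illustration of unbiased client sampling with an independent sampling procedure, the following is a minimal sketch, not the K-Vib sampler itself. It assumes each client $i$ is included independently with probability $p_i$, that the expected sample size $\sum_i p_i$ equals the budget $K$, and that sampled updates are reweighted by $1/p_i$. The function names, the synthetic updates `g`, the weights `lam`, and the norm-proportional probabilities are all hypothetical choices made for this example.

```python
import numpy as np

def isp_estimates(g, lam, p, rounds, rng):
    """Draw `rounds` independent-sampling estimates of sum_i lam[i] * g[i].

    Client i is included with probability p[i], independently of the others,
    and its contribution is reweighted by 1 / p[i] to keep the estimate unbiased.
    """
    masks = rng.random((rounds, p.shape[0])) < p   # (rounds, N) inclusion flags
    return (masks * (lam / p)) @ g                 # (rounds, d) estimates

def isp_variance(g, lam, p):
    """Closed-form E||estimate - full||^2 for independent Bernoulli inclusion."""
    sq_norms = (lam ** 2) * (g ** 2).sum(axis=1)
    return float(((1.0 - p) / p * sq_norms).sum())

def norm_proportional_probs(g, lam, budget):
    """Probabilities proportional to ||lam[i] * g[i]||, summing to `budget`.

    Assumption: no probability exceeds 1 for the data used here, so no
    capping step is needed.
    """
    norms = lam * np.linalg.norm(g, axis=1)
    return budget * norms / norms.sum()

# Synthetic setup (hypothetical): N clients, d-dimensional updates, budget K.
rng = np.random.default_rng(0)
N, d, K = 20, 5, 5
directions = rng.normal(size=(N, d))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
g = directions * np.linspace(1.0, 3.0, N)[:, None]  # update norms range from 1 to 3
lam = np.full(N, 1.0 / N)                            # uniform client weights

full = lam @ g                                       # full-participation result
p_uniform = np.full(N, K / N)
p_norm = norm_proportional_probs(g, lam, K)

est_uniform = isp_estimates(g, lam, p_uniform, 5000, rng)
est_norm = isp_estimates(g, lam, p_norm, 5000, rng)

print("max bias, uniform probabilities:", np.abs(est_uniform.mean(axis=0) - full).max())
print("max bias, norm-proportional:    ", np.abs(est_norm.mean(axis=0) - full).max())
print("variance, uniform probabilities:", isp_variance(g, lam, p_uniform))
print("variance, norm-proportional:    ", isp_variance(g, lam, p_norm))
```

In this sketch both probability choices give estimates whose average approaches the full-participation result, while the norm-proportional choice has the smaller closed-form variance. An adaptive sampler would have to learn such probabilities online, since the update norms of unsampled clients are not observed.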

Paper Structure (44 sections, 19 theorems, 100 equations, 7 figures, 2 algorithms)

Introduction
Contributions
Preliminaries
Optimal unbiased client sampling
Case Study on Sampling Procedure
RSP is a special case of ISP
ISP estimates are asymptotic to full participation results
ISP creates expected sampling size
General Convergence Analyses of FL with Unbiased Client Sampling
Remark
Interpretation of Theorem \ref{theorem:convergence}
Sampling utility
Theories of the K-Vib Sampler
Adaptive Client Sampling as Online Optimization
What does regret measure?
...and 29 more sections

Key Result

Lemma 2.1

For any communication round $t \in [T]$ in FL, random sampling yields $\mathbf{P}_{ij}^t = \text{Prob}(i,j\in S^t) = \frac{K(K-1)}{N(N-1)}$ and independent sampling yields $\mathbf{P}_{ij}^t = \text{Prob}(i,j\in S^t) = \boldsymbol{p}_i^t \boldsymbol{p}_j^t$; they admit

Figures (7)

Figure 1: The variance of ISP estimates is lower than that of RSP estimates. Global estimates are shown on the X-Y plane. (a) Scatter plot of estimate errors, where "uniform" indicates RSP with uniform probability. (b) The notations RSP($\boldsymbol{g}_i, \boldsymbol{g}_j$) and ISP($\boldsymbol{g}_i,\boldsymbol{g}_j$) represent the global estimates constructed through random sampling and independent sampling, respectively, using sampled vectors $\boldsymbol{g}_i$ and $\boldsymbol{g}_j$. "Global" indicates the full participation result. ISP($\boldsymbol{g}_i,\boldsymbol{g}_j$) is closer to Global.
Figure 2: Evaluation of dynamic regret (equation \ref{eq:regret}), gradient variance (equation \ref{eq:variance}), and test loss.
Figure 3: Data distribution of synthetic dataset and sensitivity study on $\gamma$.
Figure 4: Federated EMNIST dataset experiments.
Figure 5: Federated text dataset experiments.
...and 2 more figures

Theorems & Definitions (34)

Remark 2.1: Constraints on sampling probability
Definition 2.1: Unbiasedness of client sampling $S^t$
Lemma 2.1: Optimal sampling procedure, horvath2019nonconvex
Lemma 2.2: Optimal sampling probability, chen2020optimal
Example 3.1
Example 3.2
Definition 4.1: Sampling quality
Theorem 4.1: FedAvg with arbitrary unbiased client sampling
Theorem 5.1: Bound of best fixed probability
Lemma 5.1: Solution to equation \ref{obj:ol_ftrl}
...and 24 more
