AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

Run He; Kai Tong; Di Fang; Han Sun; Haoran Li; Tianyi Chen; Ziqian Zeng; Huiping Zhuang

TL;DR

AFL is a gradient-free federated learning framework that uses a pre-trained backbone to turn client training into a closed-form linear regression solved in one epoch, paired with an Absolute Aggregation (AA) law that enables single-round, optimal aggregation across clients. A Regularization Intermediary (RI) process handles rank-deficient cases, preserving the AA law's exact equivalence to centralized training. The approach is invariant to data partitioning and client count, converges fast, and achieves competitive accuracy in extremely non-IID settings and with large client populations, with substantial communication and computation savings. The method is validated on CIFAR-10/100 and Tiny-ImageNet with multiple backbones, and the authors provide public code for reproducibility.
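
The closed-form step can be illustrated with a short worked equation. The notation is an assumption made for this sketch and may differ from the paper's: $\bm{X}_{k}$ denotes the backbone features of client $k$ (one sample per row), $\bm{Y}_{k}$ its one-hot labels, $\gamma$ a ridge parameter, and the pooled feature matrix is assumed to have full column rank.

$$
\begin{aligned}
\bm{\hat{W}}_{k} &= \left(\bm{X}_{k}^{\top}\bm{X}_{k} + \gamma \bm{I}\right)^{-1}\bm{X}_{k}^{\top}\bm{Y}_{k} && \text{(local ridge solution)} \\
\bm{X}^{\top}\bm{X} &= \sum_{k=1}^{K}\bm{X}_{k}^{\top}\bm{X}_{k}, \qquad \bm{X}^{\top}\bm{Y} = \sum_{k=1}^{K}\bm{X}_{k}^{\top}\bm{Y}_{k} && \text{(row-stacked data)} \\
\bm{\hat{W}} &= \left(\sum_{k=1}^{K}\bm{X}_{k}^{\top}\bm{X}_{k}\right)^{-1}\sum_{k=1}^{K}\bm{X}_{k}^{\top}\bm{Y}_{k} && \text{(pooled least squares)}
\end{aligned}
$$

Both sums are unchanged by how the rows are assigned to clients, so the pooled least-squares solution does not depend on the partition. This is the intuition behind the invariance claim; the paper's AA law and RI process state the aggregation in terms of the clients' own weights and regularized matrices.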

Abstract

In this paper, we introduce analytic federated learning (AFL), a new training paradigm that brings analytical (i.e., closed-form) solutions to federated learning (FL) with pre-trained models. AFL draws inspiration from analytic learning, a gradient-free technique that trains neural networks with analytical solutions in one epoch. In the local client training stage, AFL requires only one epoch of training, eliminating the need for multi-epoch updates. In the aggregation stage, we derive an absolute aggregation (AA) law. The AA law allows single-round aggregation, which reduces heavy communication overhead and achieves fast convergence by removing the need for multiple aggregation rounds. More importantly, AFL exhibits $\textit{invariance to data partitioning}$: regardless of how the full dataset is distributed among clients, the aggregated result remains identical. This property gives rise to further benefits, such as data heterogeneity invariance and client-number invariance. We conduct experiments across various FL settings, including extremely non-IID ones and scenarios with a large number of clients (e.g., $\ge 1000$). In all these settings, AFL consistently performs competitively, while existing FL techniques encounter various obstacles. Our code is available at https://github.com/ZHUANGHP/Analytic-federated-learning.
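
The invariance property can be checked numerically with a minimal sketch. This is not the authors' released code: the function names (`local_stage`, `aggregate`, `run`) are hypothetical, random Gaussian features stand in for pre-trained backbone embeddings, and the aggregation is written in a sufficient-statistics form that assumes each client uploads a regularized Gram matrix and ridge weights (named after $\bm{C}_{k}^{\text{r}}$ and $\bm{\hat{W}}_{k}^{\text{r}}$ in Figure 1), with the server removing the accumulated regularizer afterwards.

```python
import numpy as np

def local_stage(X_k, Y_k, gamma):
    """One client's closed-form update on frozen (stand-in) backbone features."""
    d = X_k.shape[1]
    C_k = X_k.T @ X_k + gamma * np.eye(d)     # regularized Gram matrix
    W_k = np.linalg.solve(C_k, X_k.T @ Y_k)   # ridge-regression weights
    return C_k, W_k

def aggregate(uploads, gamma):
    """Single-round aggregation, then removal of the accumulated regularizer."""
    K = len(uploads)
    d = uploads[0][0].shape[0]
    C_agg = sum(C for C, _ in uploads)        # equals X^T X + K * gamma * I
    Q_agg = sum(C @ W for C, W in uploads)    # equals X^T Y
    # Assumes the pooled features have full column rank.
    return np.linalg.solve(C_agg - K * gamma * np.eye(d), Q_agg)

def run(X, Y, order, num_clients, gamma=1.0):
    """Split the samples (taken in the given order) across clients and train."""
    parts = np.array_split(order, num_clients)
    return aggregate([local_stage(X[i], Y[i], gamma) for i in parts], gamma)

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 16))                # stand-in for backbone features
labels = rng.integers(0, 5, size=600)
Y = np.eye(5)[labels]                         # one-hot targets

W_central = np.linalg.lstsq(X, Y, rcond=None)[0]       # centralized solution
W_iid = run(X, Y, rng.permutation(600), num_clients=3)
W_non_iid = run(X, Y, np.argsort(labels), num_clients=5)      # label-sorted split
W_many = run(X, Y, rng.permutation(600), num_clients=100)     # 6 samples per client

for W in (W_iid, W_non_iid, W_many):
    print(np.max(np.abs(W - W_central)))      # differences at round-off level
```

In the 100-client run each client holds fewer samples than feature dimensions, so its unregularized Gram matrix is singular; the ridge term keeps the local solve well-posed, and subtracting it at the server restores the centralized least-squares solution.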

Paper Structure (23 sections, 3 theorems, 70 equations, 3 figures, 7 tables, 1 algorithm)

Introduction
Related Works
Federated Learning Methods
Analytic Learning
Analytic Federated Learning
Local Stage: Localized Analytic Learning
Aggregation Stage: Absolute Aggregation Law
RI Process: AA Law in Rank-deficient Scenario
Experiments
Comparison with FL Techniques
Analysis on Data Partition
Training Efficiency
Ablation Study of RI Process
Validation with Different Backbones
Limitations and Future Work
...and 8 more sections

Key Result

Lemma 1

Let $\bm{X} = \begin{bmatrix} \bm{X}_{u} \\ \bm{X}_{v} \end{bmatrix}$ with $\bm{X}_{u}$ and $\bm{X}_{v}$ having full column ranks, and $\bm{X}$ follows a partition where

Figures (3)

Figure 1: An overview of AFL. During the local stage, each client calculates $\bm{C}_{k}^{\text{r}}$ and $\bm{\hat{W}}_{k}^{\text{r}}$ based on the same pre-trained backbone and its own dataset. In the aggregation stage, the server obtains $\bm{C}_{\text{agg},K}^{\text{r}}$ and $\bm{\hat{W}}_{\text{agg},K}^{\text{r}}$ and then computes $\bm{\hat{W}}$.
Figure 2: Accuracy over various numbers of clients.
Figure 3: Accuracy curves with communication rounds. Average training time is reported in the legends.

Theorems & Definitions (9)

Lemma 1
proof
Theorem 1
proof
Theorem 2
proof
proof
proof
proof
