FedEGG: Federated Learning with Explicit Global Guidance

Kun Zhai; Yifeng Gao; Difan Zou; Guangnan Ye; Siheng Chen; Xingjun Ma; Yu-Gang Jiang

TL;DR

FedEGG introduces an explicit server-side global guiding task for federated learning, constructed from a public dataset and LLMs, and couples it with the standard FL objective in a two-phase, dual-optimization framework. The method includes a convergence-aware guiding strength controlled by a log-loss ratio constraint and a threshold $\tau$ derived from the cosine similarity between guiding and FL data representations. Theoretical analysis provides an upper bound on guiding strength and shows potential convergence acceleration when the guiding task sufficiently aids the FL task, with termination conditions tied to data heterogeneity $\Gamma_g$ and alignment measure $\Pi$. Empirically, FedEGG improves over state-of-the-art FL methods across IID and non-IID settings, particularly under high heterogeneity, and can enhance existing FL methods when used in combination, demonstrating practical impact for real-world privacy-preserving learning.
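The cosine similarity between guiding and FL data representations mentioned above can be illustrated with a small sketch. It assumes each side is summarized by the mean of its feature vectors. The helper names `mean_vector` and `cosine_similarity` and the toy feature values are hypothetical, and the step that maps this similarity to the threshold $\tau$ is not given in this summary, so it is deliberately left out.

```python
import math


def mean_vector(rows):
    """Average a list of equal-length feature vectors, column by column."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; returns 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy values (assumption): features of guiding-task samples and of one client's samples.
guiding_features = [[1.0, 0.0], [1.0, 2.0]]
client_features = [[2.0, 2.0], [4.0, 4.0]]

sim = cosine_similarity(mean_vector(guiding_features), mean_vector(client_features))
print(f"cosine similarity of mean features: {sim:.4f}")
```

In this toy case both mean vectors point in the same direction, so the similarity is at its maximum; orthogonal mean features would give zero.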

Abstract

Federated Learning (FL) holds great potential for diverse applications owing to its privacy-preserving nature. However, its convergence is often challenged by non-IID data distributions, limiting its effectiveness in real-world deployments. Existing methods help address these challenges via optimization-based client constraints, adaptive client selection, or the use of pre-trained models or synthetic data. In this work, we reinterpret these approaches as all introducing an implicit guiding task to regularize and steer client learning. Following this insight, we propose to introduce an explicit global guiding task into the current FL framework to improve convergence and performance. To this end, we present FedEGG, a new FL algorithm that constructs a global guiding task using a well-defined, easy-to-converge learning task based on a public dataset and Large Language Models (LLMs). This approach effectively combines the strengths of federated (the original FL task) and centralized (the global guiding task) learning. We provide a theoretical analysis of FedEGG's convergence, examining the impact of data heterogeneity between the guiding and FL tasks and the guiding strength. Our analysis derives an upper bound for the optimal guiding strength, offering practical insights for implementation. Empirically, FedEGG demonstrates superior performance over state-of-the-art FL methods under both IID and non-IID settings, and further improves their performance when combined with them.

Paper Structure (25 sections, 1 theorem, 23 equations, 9 figures, 5 tables, 1 algorithm)


Introduction
Related Work
Proposed Method
Problem Definition
FedEGG
Overview.
Algorithm Description.
Guiding Task.
Guiding Strength.
Convergence Analysis
Convergence Result.
Remarks.
Experiments
Experimental Setup
Main Results
...and 10 more sections

Key Result

Theorem 3.7

Assume Assumptions 1-4 hold and let $\eta_t \leq \frac{1}{4L}$. Then, we have: where $\epsilon = \gamma_t^2\sigma_g^2 + 2\left(\frac{1}{\mu} - 2\gamma_t(1 - L\gamma_t)\right) \Gamma_g + 4\gamma_t(1 - L\gamma_t) \Pi,$ with $\Gamma_g = f^* - F^*$ and $\Pi = f^* - F(\overline{\mathbf{W}}_t)$.
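To make the error term concrete, the sketch below evaluates $\epsilon$ exactly as written in the theorem. The function name `epsilon_term` and all numeric inputs are hypothetical choices for illustration only; the symbols keep the roles they have in the theorem, and nothing here reproduces the bound itself.

```python
def epsilon_term(gamma_t, sigma_g_sq, mu, L, Gamma_g, Pi):
    """Evaluate
    eps = gamma_t^2 * sigma_g^2
          + 2 * (1/mu - 2*gamma_t*(1 - L*gamma_t)) * Gamma_g
          + 4 * gamma_t * (1 - L*gamma_t) * Pi
    term by term, following the expression in the theorem statement.
    """
    shared = gamma_t * (1.0 - L * gamma_t)  # factor common to the last two terms
    variance_part = gamma_t ** 2 * sigma_g_sq
    heterogeneity_part = 2.0 * (1.0 / mu - 2.0 * shared) * Gamma_g
    alignment_part = 4.0 * shared * Pi
    return variance_part + heterogeneity_part + alignment_part


# Illustrative values only (assumptions, not taken from the paper).
eps = epsilon_term(gamma_t=0.1, sigma_g_sq=1.0, mu=1.0, L=1.0, Gamma_g=0.5, Pi=0.2)
print(f"epsilon = {eps:.3f}")
```

Note that the factor $\gamma_t(1 - L\gamma_t)$ enters the $\Gamma_g$ term with a negative sign and the $\Pi$ term with a positive sign, so for fixed $\mu$ and $L$ changing $\gamma_t$ shifts weight between the two.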

Figures (9)

Figure 1: An overview of FedEGG, which operates in two phases. Phase 1: Clients upload their data labels to the server, where an LLM uses them to construct the guiding task. During each communication round, the server sends the averaged features of the guiding task to the clients, which use them to determine the guiding strength $\tau$. Phase 2: The FL task and the guiding task are jointly optimized on the same model, with the guiding task providing explicit guidance to the FL task.
Figure 2: The training losses of different FL methods across 300 rounds on CIFAR-10 under the non-IID setting.
Figure 3: Optimization trajectories of FedAvg and FedEGG in the loss landscape on CIFAR-10 (non-IID).
Figure 4: Training losses and test accuracy of FedAvg and FedEGG on CIFAR-10 and CIFAR-100 under the non-IID setting, with synthetic data generated by GPT-4 used as the guiding task for FedEGG.
Figure 5: (a): Test accuracy (%) of FedEGG under different $\tau$ with LH, MH, and HH guiding tasks on CIFAR-10 (non-IID). Triangles denote test accuracy, curves show approximated trends, and asterisks mark $\tau$ calculated in FedEGG. (b): Test accuracy of FedEGG (red bars) and pre-training (blue bars) on CIFAR-10 under IID and non-IID settings. (c): Test accuracy of FedEGG and FedAvg under varying levels of FL data heterogeneity $\alpha$ on CIFAR-10.
...and 4 more figures

Theorems & Definitions (6)

Definition 3.5
Definition 3.6
Theorem 3.7
Remark 3.8
Remark 3.9
proof
