
Beyond Model Scale Limits: End-Edge-Cloud Federated Learning with Self-Rectified Knowledge Agglomeration

Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Ke Xu, Quyang Pan, Bo Gao, Tian Wen

TL;DR

FedEEC tackles the bottleneck of training large models in Hierarchical Federated Learning (HFL) under End-Edge-Cloud Collaboration (EECC) by enabling tier-aware model scaling and robust cross-tier knowledge transfer. It introduces the Bridge Sample Based Online Distillation Protocol (BSBODP), which performs model-agnostic distillation via bridge samples, and Self-Knowledge Rectification (SKR), which corrects propagated probabilities; together they support dynamic node migration. The framework analyzes migration compatibility by contrasting equivalence and partial-order interaction protocols, and it demonstrates superior cloud-model accuracy and faster convergence on SVHN, CIFAR-10, and CINIC-10 compared with strong baselines. In practice, FedEEC reduces communication overhead by exchanging lightweight logits/embeddings and offers deployment flexibility for real-world EECC environments with heterogeneous devices.

Abstract

The rise of End-Edge-Cloud Collaboration (EECC) offers a promising paradigm for Artificial Intelligence (AI) model training across end devices, edge servers, and cloud data centers, providing enhanced reliability and reduced latency. Hierarchical Federated Learning (HFL) can benefit from this paradigm by enabling multi-tier model aggregation across distributed computing nodes. However, the potential of HFL is significantly constrained by the inherent heterogeneity and dynamic characteristics of EECC environments. Specifically, requiring a uniform model structure across all computing nodes, bounded by the least powerful end device, imposes a performance bottleneck. Meanwhile, coupled heterogeneity in data distributions and resource capabilities across tiers disrupts hierarchical knowledge transfer, leading to biased updates and degraded performance. Furthermore, the mobility and fluctuating connectivity of computing nodes in EECC environments complicate dynamic node migration, further compromising the robustness of the training process. To address these challenges within a unified framework, we propose End-Edge-Cloud Federated Learning with Self-Rectified Knowledge Agglomeration (FedEEC), a novel EECC-empowered FL framework that allows the trained models to grow larger in size and stronger in generalization ability from end to edge to cloud. FedEEC introduces two key innovations: (1) Bridge Sample Based Online Distillation Protocol (BSBODP), which enables knowledge transfer between neighboring nodes through generated bridge samples, and (2) Self-Knowledge Rectification (SKR), which refines the transferred knowledge to prevent suboptimal cloud model optimization. The proposed framework handles cross-tier resource heterogeneity and enables effective knowledge transfer between neighboring nodes, while satisfying the migration-resilience requirements of EECC.

Paper Structure

This paper contains 47 sections, 49 equations, 6 figures, 7 tables, and 3 algorithms.

Figures (6)

  • Figure 1: Hierarchical topology of End-Edge-Cloud Collaboration with dynamic node migration.
  • Figure 2: Expected distribution of model scales across tiers in an end-edge-cloud network.
  • Figure 3: Overview of BSBODP. (1) Child node distillation on bridge samples. (2) Logits extraction on child node. (3) Upload logits to parent node. (4) Parent node distillation on bridge samples. (5) Logits extraction on parent node. (6) Distribute logits to child node.
  • Figure 4: Comparison of private samples and bridge samples.
  • Figure 5: Learning curves illustrating cloud model accuracy (%) across communication rounds.
  • ...and 1 more figures
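
The six steps listed in the Figure 3 caption can be read as one bidirectional distillation round between a child node and its parent. The following is a minimal sketch of that flow, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the `Node` class (a fixed random feature map plus a linear head, standing in for models of different sizes), the random `bridge` array (the paper generates bridge samples; how it does so is not described in this excerpt), the soft-target cross-entropy loss, and all hyperparameters. SKR is not modeled.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def kl(p, q):
    """Mean KL(p || q) over rows, with a small epsilon inside the logs."""
    return float(np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=1)))

class Node:
    """Hypothetical stand-in for a node's model; `width` mimics model scale."""
    def __init__(self, dim, width, classes, rng):
        self.P = rng.normal(size=(dim, width)) / np.sqrt(dim)  # fixed random features
        self.W = rng.normal(scale=0.1, size=(width, classes))  # trainable linear head

    def features(self, x):
        return np.tanh(x @ self.P)

    def logits(self, x):
        return self.features(x) @ self.W

    def distill(self, x, teacher_logits, lr=0.1, steps=200):
        """Fit the head to the teacher's soft labels by gradient descent on cross-entropy."""
        phi, target = self.features(x), softmax(teacher_logits)
        for _ in range(steps):
            grad = phi.T @ (softmax(phi @ self.W) - target) / len(x)
            self.W -= lr * grad

def bsbodp_round(child, parent, bridge, parent_logits=None):
    """One child-to-parent-to-child exchange following the Figure 3 step order."""
    if parent_logits is not None:
        child.distill(bridge, parent_logits)   # (1) child distills on bridge samples
    child_logits = child.logits(bridge)        # (2) logits extraction on child
    uploaded = child_logits                    # (3) upload logits (only an array crosses the link)
    parent.distill(bridge, uploaded)           # (4) parent distills on bridge samples
    new_parent_logits = parent.logits(bridge)  # (5) logits extraction on parent
    return new_parent_logits                   # (6) distribute logits to child

rng = np.random.default_rng(0)
bridge = rng.normal(size=(64, 16))             # assumed placeholder bridge samples
child = Node(dim=16, width=8, classes=10, rng=rng)
parent = Node(dim=16, width=32, classes=10, rng=rng)

gap_before = kl(softmax(child.logits(bridge)), softmax(parent.logits(bridge)))
down = bsbodp_round(child, parent, bridge)     # first round: no parent logits yet
gap_after = kl(softmax(child.logits(bridge)), softmax(parent.logits(bridge)))
print(f"child/parent disagreement on bridge samples: {gap_before:.4f} -> {gap_after:.4f}")
```

Because only logits computed on the shared bridge samples are exchanged, the child and parent in this sketch never need matching parameter shapes, which is the property the TL;DR calls model-agnostic distillation. Passing `down` back in as `parent_logits` on the next call would run the child-side step (1).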