FTTE: Federated Learning on Resource-Constrained Devices

Irene Tenison, Anna Murphy, Charles Beauville, Lalana Kagal

TL;DR

FTTE tackles the challenge of deploying federated learning on resource-constrained edge devices by introducing memory-aware parameter selection and sparse semi-asynchronous aggregation governed by an age- and variance-weighted staleness function. The server computes a global sparse parameter subset $w^*$ under a memory budget $M_{min}$ and aggregates only sparse updates via a buffer, mitigating straggler bias. Empirically, FTTE delivers 81% faster convergence, 80% on-device memory reduction, and 69% payload reduction on CIFAR-10 while maintaining or surpassing the accuracy of semi-asynchronous baselines, and scales to 500 clients with up to 90% stragglers. This work demonstrates a practical, scalable solution for real-world FL on heterogeneous, highly resource-constrained edge networks and suggests avenues for integration with quantization and privacy-preserving techniques.
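
To make the memory-budgeted selection step concrete, the following is a minimal sketch of choosing a sparse parameter subset under a byte budget. The summary above does not state the selection criterion, so the greedy importance-per-byte ranking, the function name `select_sparse_subset`, and the toy sizes and scores are all illustrative assumptions, not the authors' method.

```python
from typing import Dict, List


def select_sparse_subset(
    param_bytes: Dict[str, int],
    importance: Dict[str, float],
    memory_budget: int,
) -> List[str]:
    """Greedily pick parameter tensors that fit within a memory budget.

    Assumption: tensors are ranked by importance per byte. The source only
    says a subset w* is chosen under a budget M_min, not how it is ranked.
    """
    ranked = sorted(
        param_bytes,
        key=lambda name: importance[name] / param_bytes[name],
        reverse=True,
    )
    selected: List[str] = []
    used = 0
    for name in ranked:
        # Skip any tensor that would push the total past the budget.
        if used + param_bytes[name] <= memory_budget:
            selected.append(name)
            used += param_bytes[name]
    return selected


# Toy model: tensor sizes in bytes and made-up importance scores.
sizes = {"conv1.weight": 400, "conv2.weight": 1600, "fc.weight": 800, "fc.bias": 40}
scores = {"conv1.weight": 0.2, "conv2.weight": 0.4, "fc.weight": 0.8, "fc.bias": 0.1}
subset = select_sparse_subset(sizes, scores, memory_budget=1300)
print(subset)
```

In this toy run the largest tensor is left frozen because it alone exceeds the budget, so clients would train and transmit only the three selected tensors.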

Abstract

Federated learning (FL) enables collaborative model training across distributed devices while preserving data privacy, but deployment on resource-constrained edge nodes remains challenging due to limited memory, energy, and communication bandwidth. Traditional synchronous and asynchronous FL approaches further suffer from straggler-induced delays and slow convergence in heterogeneous, large-scale networks. We present FTTE (Federated Tiny Training Engine), a novel semi-asynchronous FL framework that uniquely employs sparse parameter updates and a staleness-weighted aggregation based on both the age and the variance of client updates. Extensive experiments across diverse models and data distributions, including up to 500 clients and 90% stragglers, demonstrate that FTTE not only achieves 81% faster convergence, 80% lower on-device memory usage, and a 69% smaller communication payload than synchronous FL (e.g., FedAVG), but also consistently reaches comparable or higher target accuracy than semi-asynchronous FL (e.g., FedBuff) in challenging regimes. These results establish FTTE as the first practical and scalable solution for real-world FL deployments on heterogeneous and predominantly resource-constrained edge devices.
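
The buffered, staleness-weighted aggregation described in the abstract can be sketched roughly as follows. The abstract states only that the weight depends on the age and variance of client updates; the specific decay form in `staleness_weight`, the hyperparameters `alpha` and `beta`, and the use of the per-update element variance are assumptions made for illustration and may differ from the paper's definition.

```python
from statistics import pvariance
from typing import Dict, List, Tuple

# A sparse update carries deltas only for the selected tensors.
Update = Dict[str, List[float]]


def staleness_weight(age: int, variance: float,
                     alpha: float = 0.5, beta: float = 1.0) -> float:
    """Hypothetical weight that shrinks as age or variance grows."""
    return 1.0 / ((1.0 + age) ** alpha * (1.0 + beta * variance))


def aggregate_buffer(global_params: Dict[str, List[float]],
                     buffer: List[Tuple[int, Update]],
                     server_lr: float = 1.0) -> Dict[str, List[float]]:
    """Apply a weighted average of buffered sparse updates.

    Each buffer entry is (age, update), where age counts the server rounds
    elapsed since the client fetched its copy of the model.
    """
    weights = []
    for age, update in buffer:
        flat = [v for values in update.values() for v in values]
        weights.append(staleness_weight(age, pvariance(flat)))
    total = sum(weights)

    new_params = {name: list(values) for name, values in global_params.items()}
    for weight, (_, update) in zip(weights, buffer):
        for name, delta in update.items():
            for i, d in enumerate(delta):
                # Tensors absent from the update are left untouched.
                new_params[name][i] += server_lr * (weight / total) * d
    return new_params


model = {"fc.weight": [0.0, 0.0], "conv1.weight": [1.0]}
buffered = [
    (0, {"fc.weight": [1.0, 1.0]}),  # fresh client
    (3, {"fc.weight": [1.0, 1.0]}),  # straggler, three rounds old
]
updated = aggregate_buffer(model, buffered)
```

Under these assumed weights the straggler's update counts half as much as the fresh one, and the frozen `conv1.weight` tensor is never modified or transmitted.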

Paper Structure

This paper contains 8 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: On average, FTTE (a) converges 81% faster, (b) consumes 80% less on-device memory, and (c) requires a 69% smaller payload on CIFAR-10 than FedAVG (synchronous FL).
  • Figure 2: Illustration of FTTE, an FL method for resource-constrained federated systems with significantly faster convergence (fewer communication rounds) and improved resource efficiency (on-device memory and payload size).
  • Figure 3: (a) On-device memory and (b) payload size requirements for FTTE with full updates as in classic FL, last layer update as in TL, and sparse update (ours).
  • Figure 4: (a) Convergence accuracy of sparse updates versus the last-layer update scheme. (b) FTTE scales to up to 500 devices with 50% randomly chosen stragglers.
  • Figure 5: Communication steps required by FedAVG and FTTE under (a) an increasing percentage of stragglers and (b) an increasing delay per straggling client in the federated network.