Federated Learning over 5G, WiFi, and Ethernet: Measurements and Evaluation
Robert J. Hayek, Joaquin Chung, Kayla Comer, Chandra R. Murthy, Rajkumar Kettimuthu, Igor Kadota
TL;DR
This work presents a measurement-focused Federated Learning deployment over a real 5G-NR SA testbed, comparing FL performance when edge devices communicate via 5G, WiFi, and Ethernet. Using the Flower framework with a SqueezeNet CIFAR-10 setup on Raspberry Pis, the authors instrument ML and network metrics (including uplink/downlink times and aggregation delays) to analyze convergence and the straggler effect. Key findings show that uplink latency on 5G significantly prolongs a communication round, contributing roughly 23% of round duration and amplifying convergence time and stragglers compared to Ethernet and WiFi. The study provides actionable insights into the feasibility and limitations of real-world FL over heterogeneous networks and makes all software and data openly available for reproducibility and further research.
Abstract
Federated Learning (FL) on IoT devices is poised to benefit significantly from advances in NextG wireless. In this paper, we deploy an FL application on a 5G-NR Standalone (SA) testbed built from open-source and Commercial Off-the-Shelf (COTS) components. The testbed consists of a network of resource-constrained edge devices, namely Raspberry Pis, and a central server equipped with a Software Defined Radio (SDR) and running O-RAN software. The testbed also allows edge devices to communicate with the server over WiFi or Ethernet instead of 5G. FL is deployed using the Flower FL framework, for which we developed a comprehensive instrumentation tool that collects and analyzes diverse communication and machine learning performance metrics, including model aggregation time, downlink transmission time, training time, and uplink transmission time. Leveraging these measurements, we perform a comparative analysis of the FL application across three network interfaces: 5G, WiFi, and Ethernet. Our experimental results suggest that, on 5G, the uplink model transfer time is a significant factor in the convergence time of FL. In particular, we find that the 5G uplink accounts for roughly 23% of the duration of an average communication round when using all edge devices in our testbed. The uplink time on 5G is 33.3x higher than on Ethernet and 17.8x higher than on WiFi. Our results also suggest that 5G exacerbates the well-known straggler effect. For reproducibility, we have open-sourced our FL application, instrumentation tools, and testbed configuration.
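To make the per-round metrics concrete, the following minimal Python sketch shows one way the four measured phases (downlink, training, uplink, aggregation) could be combined into a per-phase share of a round and a simple straggler gap. It is an illustration under stated assumptions, not the authors' instrumentation tool: the `ClientRoundTiming` schema, the helper names, the assumption that a synchronous round lasts as long as its slowest client plus aggregation, and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClientRoundTiming:
    """Per-client timings for one round, in seconds (hypothetical schema)."""
    downlink_s: float  # server -> client model transfer
    training_s: float  # local training on the edge device
    uplink_s: float    # client -> server model transfer

    @property
    def total_s(self) -> float:
        return self.downlink_s + self.training_s + self.uplink_s


def round_breakdown(clients: List[ClientRoundTiming],
                    aggregation_s: float) -> Dict[str, float]:
    """Fraction of the round spent in each phase.

    Assumption: the round is synchronous, so its duration is the slowest
    client's total time plus the server-side aggregation time.
    """
    slowest = max(clients, key=lambda c: c.total_s)
    round_s = slowest.total_s + aggregation_s
    return {
        "downlink": slowest.downlink_s / round_s,
        "training": slowest.training_s / round_s,
        "uplink": slowest.uplink_s / round_s,
        "aggregation": aggregation_s / round_s,
    }


def straggler_gap_s(clients: List[ClientRoundTiming]) -> float:
    """Time the fastest client spends waiting for the slowest one."""
    totals = [c.total_s for c in clients]
    return max(totals) - min(totals)


# Placeholder values for illustration only; NOT measurements from the paper.
clients = [
    ClientRoundTiming(downlink_s=1.0, training_s=6.0, uplink_s=2.0),
    ClientRoundTiming(downlink_s=1.0, training_s=6.0, uplink_s=3.0),
]
shares = round_breakdown(clients, aggregation_s=2.0)
gap = straggler_gap_s(clients)
print(f"uplink share: {shares['uplink']:.0%}, straggler gap: {gap:.1f} s")
```

Under this model, a slower uplink on a single client lengthens the whole round, which is one way to read the reported link between 5G uplink time and the straggler effect.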
