HeteroSwitch: Characterizing and Taming System-Induced Data Heterogeneity in Federated Learning

Gyudong Kim; Mehdi Ghasemi; Soroush Heidari; Seungryong Kim; Young Geun Kim; Sarma Vrudhula; Carole-Jean Wu

TL;DR

This work identifies system-induced data heterogeneity, which arises from hardware and software (HW/SW) fragmentation across devices, as a key but underexplored factor limiting federated learning (FL) performance. It introduces HeteroSwitch, a selective generalization framework that first measures per-client bias via loss dynamics and then, in a targeted, device-aware manner, applies ISP-based data diversification (random White Balance and Gamma) coupled with SWAD weight averaging. Experiments on a realistic FL dataset (FLAIR) and a synthetic benchmark (CIFAR-100) show substantial reductions in cross-device variance and improvements in worst-case and average accuracy, outperforming baselines such as FedAvg, q-FedAvg, FedProx, and Scaffold. The results demonstrate the practical value of adaptive generalization for stabilizing FL in heterogeneous device ecosystems and highlight implications for fairness and domain generalization in real-world deployments.

Abstract

Federated Learning (FL) is a practical approach to train deep learning models collaboratively across user-end devices, protecting user privacy by retaining raw data on-device. In FL, participating user-end devices are highly fragmented in terms of hardware and software configurations. Such fragmentation introduces a new type of data heterogeneity in FL, namely system-induced data heterogeneity, as each device generates distinct data depending on its hardware and software configurations. In this paper, we first characterize the impact of system-induced data heterogeneity on FL model performance. We collect a dataset using heterogeneous devices with variations across vendors and performance tiers. Using this dataset, we demonstrate that system-induced data heterogeneity negatively impacts accuracy and exacerbates fairness and domain generalization problems in FL. To address these challenges, we propose HeteroSwitch, which adaptively adopts generalization techniques (i.e., ISP transformation and SWAD) depending on the level of bias caused by varying HW and SW configurations. In our evaluation with a realistic FL dataset (FLAIR), HeteroSwitch reduces the variance of averaged precision by 6.3% across device types.
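
The ISP transformation mentioned above is described in the TL;DR as random White Balance and Gamma. A minimal sketch of that kind of augmentation follows. It is an illustration under stated assumptions, not the authors' implementation: the function name `random_wb_gamma`, the per-channel gain range, and the gamma range are all hypothetical, and the image is modelled as a plain list of RGB tuples in [0, 1] to keep the example dependency-free.

```python
import random


def random_wb_gamma(pixels, rng, gain_range=(0.8, 1.2), gamma_range=(0.7, 1.5)):
    """Apply one random white-balance gain per channel and one random gamma.

    pixels: list of (r, g, b) tuples with values in [0, 1].
    gain_range and gamma_range are illustrative assumptions, not values
    taken from the paper.
    """
    # Draw one gain per colour channel and a single gamma for the whole image.
    gains = [rng.uniform(*gain_range) for _ in range(3)]
    gamma = rng.uniform(*gamma_range)
    out = []
    for px in pixels:
        # White balance: scale each channel, then clip to the valid range.
        balanced = [min(1.0, max(0.0, c * g)) for c, g in zip(px, gains)]
        # Gamma: power-law tone curve with exponent `gamma` on every channel.
        out.append(tuple(c ** gamma for c in balanced))
    return out


rng = random.Random(0)  # fixed seed so the example is reproducible
image = [(0.2, 0.5, 0.8), (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]
augmented = random_wb_gamma(image, rng)
```

Drawing fresh gains and a fresh gamma per image is one plausible way to mimic the colour and tone differences that different device ISPs produce; how HeteroSwitch decides when to apply such a transformation is governed by its bias measurement, which this sketch does not model.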

Paper Structure (27 sections, 3 equations, 9 figures, 6 tables, 1 algorithm)

Introduction
Background
Federated Learning
ISP Pipeline
System-induced Data Heterogeneity in FL
Dataset Creation
Data Heterogeneity across Device type
Deeper Look at Heterogeneity: HW
Deeper Look at Heterogeneity: SW
Fairness and Domain Generalization Issues
Fairness
Domain Generalization
Proposed Design: HeteroSwitch
Bias Measurement
Generalization Techniques
...and 12 more sections

Figures (9)

Figure 1: End-to-end ISP pipeline for a vision DNN. Devices generate heterogeneous data due to SW and HW variations, degrading model performance significantly. The accuracy of Homogeneous Client is obtained when all FL devices are of the same type, whereas that of Heterogeneous Client is obtained with different device types.
Figure 2: Model quality degradation when the model is deployed to various device types using RAW data.
Figure 3: Model quality degradation when the model is tested with ISP algorithms adjusted at every stage.
Figure 4: Model quality degradation over the highest accuracy achieved by dominant devices, Galaxy S9 & S6, showing the bias in the global model towards dominant devices. The participant devices used for training followed the ratio specified in Table \ref{table:devices}.
Figure 5: Model quality degradation when a device type is excluded from training, illustrating the complex relationship of DG and accuracy with various device types in FL.
...and 4 more figures
