Analyzing the Impact of Participant Failures in Cross-Silo Federated Learning

Fabian Stricker, David Bermbach, Christian Zirpins

TL;DR

This study investigates how participant failures affect model quality in cross-silo federated learning with a small number of organizations. It performs a comprehensive experimental analysis across data skew, data complexity, timing of dropout, and evaluation settings, using RoundSVEstimation to assess participant contributions. Experiments on CIFAR-10 and CIFAR-100 under non-IID Dirichlet partitions reveal that evaluation can become optimistic when dropouts occur, that high-value participants can be identified via contribution metrics, and that dropout timing significantly alters learning dynamics. The findings offer practical guidance for designing robust failure-handling strategies in cross-silo FL and for researchers and software architects building reliable inter-organizational FL systems.

Abstract

Federated learning (FL) is a new paradigm for training machine learning (ML) models without sharing data. When FL is applied in cross-silo scenarios, where organizations collaborate, the FL system must be reliable; however, participants can fail for various reasons (e.g., communication issues or misconfigurations). Providing a reliable system therefore requires analyzing the impact of participant failures. While this problem has received attention in cross-device FL, where mobile devices with limited resources participate, there is comparatively little research in cross-silo FL. We therefore conduct an extensive study of the impact of participant failures on model quality in inter-organizational cross-silo FL with few participants. We focus on generally influential factors: the timing of the failure, the data, and the effect on the evaluation, which is important for deciding whether the model should be deployed. We show that under high skews the evaluation is optimistic and hides the real impact. Furthermore, we demonstrate that the timing impacts the quality of the trained model. Our results offer insights for researchers and software architects aiming to build robust FL systems.
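
The TL;DR refers to non-IID Dirichlet partitions of CIFAR-10 and CIFAR-100. As an illustration of what such a label skew can look like, the following is a minimal sketch of one common Dirichlet-based partitioning scheme. It is not necessarily the procedure used in the paper; the function name `dirichlet_partition`, the parameter names, and the toy label array are assumptions made for this example. A smaller `alpha` yields a stronger skew, so a single participant may hold most samples of a class, which is the situation in which a dropout removes data the remaining participants cannot compensate for.

```python
import numpy as np


def dirichlet_partition(labels, num_participants, alpha, seed=0):
    """Split sample indices across participants with Dirichlet label skew.

    Hypothetical helper for illustration only: for each class, the share
    that every participant receives is drawn from Dirichlet(alpha).
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    parts = [[] for _ in range(num_participants)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # Per-class proportions over participants; small alpha -> high skew.
        proportions = rng.dirichlet(alpha * np.ones(num_participants))
        # Convert cumulative proportions into split points within this class.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for p, chunk in enumerate(np.split(idx, cuts)):
            parts[p].extend(chunk.tolist())
    return parts


if __name__ == "__main__":
    # Toy stand-in for CIFAR-10 labels: 10 classes, 100 samples each.
    toy_labels = np.repeat(np.arange(10), 100)
    for alpha in (0.1, 10.0):
        parts = dirichlet_partition(toy_labels, num_participants=5, alpha=alpha)
        print(f"alpha={alpha}: samples per participant = {[len(p) for p in parts]}")
```

Every sample is assigned to exactly one participant, so the partition sizes always sum to the dataset size; only the class composition per participant changes with `alpha`.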

Paper Structure

This paper contains 20 sections, 8 figures, 8 tables.

Figures (8)

  • Figure 1: CIFAR-10: Comparison of the evaluation with and without the failed participant across different skews.
  • Figure 2: CIFAR-100: Comparison of the evaluation with and without the failed participant across different skews.
  • Figure 3: CIFAR-10: Impact of different participants dropping out on the performance across different skews.
  • Figure 4: CIFAR-100: Impact of different participants dropping out on the performance.
  • Figure 5: CIFAR-10: Comparison of the global model's performance for the failed participant against the performance for the other participants.
  • ...and 3 more figures