
FedStein: Enhancing Multi-Domain Federated Learning Through James-Stein Estimator

Sunny Gupta, Nikita Jangid, Amit Sethi

TL;DR

FedStein shares only the James-Stein (JS) estimates of batch normalization (BN) statistics across clients while keeping BN parameters local, and it surpasses existing methods such as FedAvg and FedBN.

Abstract

Federated Learning (FL) facilitates data privacy by enabling collaborative in-situ training across decentralized clients. Despite its inherent advantages, FL faces significant performance and convergence challenges when dealing with data that is not independently and identically distributed (non-i.i.d.). While previous research has primarily addressed the issue of skewed label distribution across clients, this study focuses on the less explored challenge of multi-domain FL, where client data originates from distinct domains with varying feature distributions. We introduce FedStein, a novel method designed to address these challenges. FedStein shares only the James-Stein (JS) estimates of batch normalization (BN) statistics across clients, while keeping BN parameters local. The non-BN layer parameters are exchanged via standard FL techniques. Extensive experiments conducted across three datasets and multiple models demonstrate that FedStein surpasses existing methods such as FedAvg and FedBN, with accuracy improvements exceeding 14% in certain domains, leading to enhanced domain generalization. The code is available at https://github.com/sunnyinAI/FedStein
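The abstract names two ingredients: James-Stein shrinkage applied to BN statistics, and standard averaging for the non-BN layers. The sketch below illustrates both under stated assumptions and is not the authors' implementation. `james_stein` is the textbook positive-part James-Stein estimator (shrinkage toward zero, known noise variance `sigma2`); how FedStein chooses the shrinkage target and variance is not specified in this summary. `is_bn_key` and `average_non_bn` are hypothetical helpers, and the convention that BN entries contain `"bn"` in their name is an assumption made for illustration only.

```python
import numpy as np


def james_stein(x, sigma2=1.0):
    """Positive-part James-Stein shrinkage of x toward zero.

    Textbook setting: x ~ N(theta, sigma2 * I) with p >= 3 components.
    The shrinkage target (zero) and sigma2 are assumptions of this sketch.
    """
    x = np.asarray(x, dtype=float)
    p = x.size
    sq_norm = float(np.sum(x * x))
    if p < 3 or sq_norm == 0.0:
        # Below three dimensions the estimator offers no improvement;
        # a zero vector has nothing to shrink.
        return x.copy()
    shrink = max(0.0, 1.0 - (p - 2) * sigma2 / sq_norm)
    return shrink * x


def is_bn_key(name):
    # Hypothetical naming convention: BN entries contain "bn" in their key.
    return "bn" in name


def average_non_bn(client_states):
    """Plain mean of the non-BN entries across clients (FedAvg-style).

    BN entries are left out of the result, so each client keeps its own.
    """
    keys = [k for k in client_states[0] if not is_bn_key(k)]
    return {k: np.mean([s[k] for s in client_states], axis=0) for k in keys}


if __name__ == "__main__":
    # A vector of per-channel BN running means (made-up values).
    running_mean = np.array([3.0, 4.0, 0.0, 0.0])
    print(james_stein(running_mean))

    clients = [
        {"conv.weight": np.array([1.0, 2.0]), "bn1.running_mean": np.array([0.0])},
        {"conv.weight": np.array([3.0, 4.0]), "bn1.running_mean": np.array([10.0])},
    ]
    print(average_non_bn(clients))
```

With `sigma2=1.0`, the example vector has squared norm 25 and four components, so the shrinkage factor is 1 - 2/25 = 0.92. Vectors whose squared norm falls below `(p - 2) * sigma2` are shrunk all the way to zero by the positive-part rule.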

Paper Structure (14 sections, 9 equations, 2 figures, 3 tables, 1 algorithm)

Figures (2)

  • Figure 1: We focus on multi-domain federated learning, where each client possesses data from a specific domain. This framework is highly relevant and useful in real-world applications. For example, autonomous vehicles in different regions gather images under various weather conditions.
  • Figure 2: An illustration of two different approaches to multi-centre training of BN layers. This description follows the definitions provided in Eq. 3. Computation flows from left to right. (a) In FedAvg, both BN parameters and BN statistics are aggregated at the server, and (b) in FedStein, BN parameters are removed, and the JS norms of the BN statistics are aggregated at the server. Non-BN layers are shared in both methods.