Hybrid Batch Normalisation: Resolving the Dilemma of Batch Normalisation in Federated Learning
Hongyao Chen, Tianyang Xu, Xiaojun Wu, Josef Kittler
TL;DR
Federated learning with non-IID data undermines Batch Normalisation because the aggregated global statistics are biased. The authors introduce Hybrid Batch Normalisation (HBN), which decouples the update of BN statistics from that of the learnable parameters and uses a learnable per-channel α to blend local batch statistics with unbiased global statistics computed at the server. Key contributions include an asynchronous, unbiased aggregation of global statistics, a per-client HBN layer governed by α, and extensive experiments showing robustness to data heterogeneity and small batch sizes across common FL architectures. HBN acts as a practical plug-in that improves FL performance with modest communication and computation overhead, and it applies across diverse networks and datasets.
Abstract
Batch Normalisation (BN) is widely used in conventional deep neural network training to harmonise the input-output distributions for each batch of data. However, federated learning, a distributed learning paradigm, faces the challenge of dealing with non-independent and identically distributed (non-IID) data among the client nodes. Because there is no coherent methodology for updating the BN statistical parameters, standard BN degrades federated learning performance. An alternative normalisation solution for federated learning is therefore urgently needed. In this work, we resolve the dilemma of the BN layer in federated learning by developing a customised normalisation approach, Hybrid Batch Normalisation (HBN). HBN separates the update of statistical parameters (i.e., means and variances used for evaluation) from that of learnable parameters (i.e., parameters that require gradient updates), obtaining unbiased estimates of global statistical parameters in distributed scenarios. In contrast with existing solutions, we emphasise the supportive power of global statistics for federated learning. The HBN layer introduces a learnable hybrid distribution factor, allowing each computing node to adaptively mix the statistical parameters of the current batch with the global statistics. HBN can serve as a powerful plugin to advance federated learning performance. It shows promising merits across a wide range of federated learning settings, especially for small batch sizes and heterogeneous data.
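The two ideas in the abstract, unbiased global statistics and a hybrid mix of batch and global statistics, can be illustrated with a minimal single-channel sketch. It is not the authors' implementation: the function names `pooled_statistics` and `hybrid_normalise` are hypothetical, the convex combination weighted by `alpha` is only one plausible reading of the "hybrid distribution factor", and `alpha` is a fixed scalar here, whereas the paper describes it as learnable per channel.

```python
from math import sqrt


def pooled_statistics(client_stats):
    """Combine per-client (count, mean, variance) triples into global statistics.

    Assumption: each client reports the population (biased) variance of its data.
    The result equals the mean and variance of the pooled data, unlike a plain
    average of client variances, which ignores the spread between client means.
    """
    total = sum(n for n, _, _ in client_stats)
    mean = sum(n * m for n, m, _ in client_stats) / total
    # Law of total variance: within-client variance plus between-client spread.
    var = sum(n * (v + (m - mean) ** 2) for n, m, v in client_stats) / total
    return mean, var


def hybrid_normalise(batch, global_mean, global_var, alpha, eps=1e-5):
    """Normalise one channel with a mix of batch and global statistics.

    Assumption: the mix is a convex combination, with alpha = 1 recovering
    plain batch statistics and alpha = 0 using only the global statistics.
    """
    n = len(batch)
    batch_mean = sum(batch) / n
    batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
    mean = alpha * batch_mean + (1 - alpha) * global_mean
    var = alpha * batch_var + (1 - alpha) * global_var
    return [(x - mean) / sqrt(var + eps) for x in batch]


# Two hypothetical clients with very different local distributions.
global_mean, global_var = pooled_statistics([(2, 1.0, 1.0), (2, 6.0, 4.0)])
normalised = hybrid_normalise([0.0, 2.0], global_mean, global_var, alpha=0.5)
```

In this toy setting the plain average of the two client variances would be 2.5, while the pooled variance is considerably larger because the client means differ. This is the kind of bias that non-IID data introduces when statistics are averaged naively.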
