Normalizing Batch Normalization for Long-Tailed Recognition
Yuxiang Bao, Guoliang Kang, Linlin Yang, Xiaoyue Duan, Bo Zhao, Baochang Zhang
TL;DR
This work addresses long-tailed recognition by showing that the bias towards frequent classes can be encoded in the features themselves: rare-specific features tend to be much weaker than frequent-specific ones. It introduces Normalizing Batch Normalization (NBN), which decouples the magnitude and direction of the BN parameters and normalizes them to balance feature strengths, complemented by logit rectification to reduce classifier bias. Across CIFAR-10/100-LT, ImageNet-LT, and iNaturalist 2018, NBN yields consistent improvements and remains compatible with other long-tailed methods, offering a simple, plug-and-play solution that improves rare-class performance without sacrificing head accuracy. The approach also shows promise for detection/segmentation on LVIS-V1, suggesting applicability to other long-tailed visual tasks.
Abstract
In real-world scenarios, the number of training samples per class usually follows a long-tailed distribution. A conventionally trained network may achieve unexpectedly inferior performance on rare classes compared to frequent classes. Most previous works attempt to rectify this network bias at the data level or at the classifier level. In contrast, in this paper we identify that the bias towards frequent classes may be encoded in the features, i.e., the rare-specific features, which play a key role in discriminating rare classes, are much weaker than the frequent-specific features. Based on this observation, we introduce a simple yet effective approach: normalizing the parameters of the Batch Normalization (BN) layer to explicitly rectify the feature bias. To this end, we represent the Weight/Bias parameters of a BN layer as a vector, normalize it to a unit vector, and multiply the unit vector by a learnable scalar parameter. By decoupling the direction and magnitude of the BN parameters during learning, the Weight/Bias exhibits a more balanced distribution, and thus the strength of features becomes more even. Extensive experiments on various long-tailed recognition benchmarks (i.e., CIFAR-10/100-LT, ImageNet-LT and iNaturalist 2018) show that our method remarkably outperforms previous state-of-the-art methods. The code and checkpoints are available at https://github.com/yuxiangbao/NBN.
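
The reparameterization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that); it only shows the idea of writing a BN parameter vector as a learnable scalar times a unit-norm direction. The function names `normalize_to_scale` and `batch_norm`, the use of separate scalars for Weight and Bias, and the small `eps` terms are assumptions made for illustration, and logit rectification is not covered.

```python
import math


def normalize_to_scale(v, g, eps=1e-12):
    """Return g * v / ||v||: a unit-norm direction scaled by one scalar g.

    v is the raw per-channel parameter vector (direction), g is the scalar
    magnitude. Both would be learnable in training. Hypothetical helper.
    """
    norm = math.sqrt(sum(x * x for x in v))
    return [g * x / (norm + eps) for x in v]


def batch_norm(batch, weight, bias, eps=1e-5):
    """Plain batch normalization over a list of samples with C channels each."""
    n = len(batch)
    c = len(weight)
    means = [sum(row[j] for row in batch) / n for j in range(c)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(c)]
    return [
        [
            weight[j] * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + bias[j]
            for j in range(c)
        ]
        for row in batch
    ]


# Raw direction vectors and scalar magnitudes (assumed: one scalar per vector).
raw_weight, g_weight = [3.0, 4.0], 2.0
raw_bias, g_bias = [1.0, 0.0], 0.5

# Effective BN parameters: the vector norm equals |g| regardless of the raw scale.
weight = normalize_to_scale(raw_weight, g_weight)
bias = normalize_to_scale(raw_bias, g_bias)

batch = [[1.0, 10.0], [3.0, 30.0]]
output = batch_norm(batch, weight, bias)
```

Because the effective Weight vector always has norm `|g_weight|`, enlarging one channel's raw value necessarily shrinks the relative share of the others, which is one way to read the abstract's claim that the parameters become more balanced.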
