Generating Multi-Center Classifier via Conditional Gaussian Distribution
Zhemin Zhang, Xun Gong
TL;DR
This work tackles the limitation of uni-center linear classifiers by formulating a multi-center classifier grounded in a Gaussian Mixture assumption for deep features. For each class $c$, a conditional Gaussian with mean $w_c$ and learnable variance $\sigma_c^2$ seeds $K$ sub-centers via $w_c^{(k)} = w_c + \sigma_c \odot \varepsilon$, with $\varepsilon$ drawn from a standard Gaussian. During training, the $C$ class means and their sub-centers are treated as $C(K+1)$ classes, supervised with a Multi-Center Class Label and a variance regularization term, giving the total loss $\mathcal{L} = \mathcal{L}_m + \mathcal{L}_{\sigma^2}$. At test time, the sub-centers are discarded and only the class means $w_c$ are kept, so inference is identical to that of a vanilla linear classifier. Empirically, the method improves top-1 accuracy on ImageNet-1K for both CNNs and ViTs (e.g., +0.9 percentage points for ResNet-50 and +0.4 points for Swin-T) and remains compatible with data augmentations and softmax variants, demonstrating enhanced intra-class structure without additional test-time cost. The approach offers a practical route to richer feature distributions and reduced over-clustering in large-scale visual recognition tasks.
Abstract
The linear classifier is widely used in various image classification tasks. It works by optimizing the distance between a sample and its corresponding class center. However, in real-world data, one class can contain several local clusters, e.g., birds of different poses. To address this complexity, we propose a novel multi-center classifier. Unlike the vanilla linear classifier, our proposal is established on the assumption that the deep features of the training set follow a Gaussian Mixture distribution. Specifically, we create a conditional Gaussian distribution for each class and then sample multiple sub-centers from that distribution to extend the linear classifier. This approach allows the model to capture intra-class local structures more efficiently. In addition, at test time we set the mean of the conditional Gaussian distribution as the class center of the linear classifier and follow the vanilla linear classifier outputs, thus requiring no additional parameters or computational overhead. Extensive experiments on image classification show that the proposed multi-center classifier is a powerful alternative to widely used linear classifiers. Code is available at https://github.com/ZheminZhang1/MultiCenter-Classifier.
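The sampling step and the train/test asymmetry described above can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that); it is a minimal standard-library example that assumes bias-free dot-product logits and toy two-dimensional features. The function names `sample_sub_centers`, `build_training_centers`, and `logits` are hypothetical, and the Multi-Center Class Label and the variance regularization term are deliberately left out because their exact form is not given here.

```python
import random


def sample_sub_centers(w_c, sigma_c, num_sub_centers, rng):
    """Draw K sub-centers as w_c + sigma_c * eps, with eps ~ N(0, 1) per dimension."""
    return [
        [w + s * rng.gauss(0.0, 1.0) for w, s in zip(w_c, sigma_c)]
        for _ in range(num_sub_centers)
    ]


def build_training_centers(means, sigmas, num_sub_centers, rng):
    """Place each class mean before its K sampled sub-centers: C * (K + 1) rows."""
    centers = []
    for w_c, sigma_c in zip(means, sigmas):
        centers.append(list(w_c))
        centers.extend(sample_sub_centers(w_c, sigma_c, num_sub_centers, rng))
    return centers


def logits(feature, centers):
    """Dot product of one feature vector with every center (assumed: no bias term)."""
    return [sum(f * w for f, w in zip(feature, row)) for row in centers]


rng = random.Random(0)
means = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # toy values: C = 3 classes, 2-D features
sigmas = [[0.1, 0.1]] * 3                        # toy per-class standard deviations
K = 4                                            # sub-centers per class

# Training: the classifier is extended to C * (K + 1) centers.
train_centers = build_training_centers(means, sigmas, K, rng)
feature = [0.9, 0.2]
train_logits = logits(feature, train_centers)

# Test: sub-centers are dropped and only the C class means are used.
test_logits = logits(feature, means)
predicted_class = max(range(len(test_logits)), key=test_logits.__getitem__)
```

In this sketch the training-time logit vector has $C(K+1) = 15$ entries, while the test-time vector has only $C = 3$, which is why inference costs the same as a vanilla linear classifier.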
