Three Heads Are Better Than One: Complementary Experts for Long-Tailed Semi-supervised Learning
Chengcheng Ma, Ismail Elezi, Jiankang Deng, Weiming Dong, Changsheng Xu
TL;DR
This work tackles long-tailed semi-supervised learning (LTSSL), where a mismatch between the labeled and unlabeled class distributions biases pseudo-labels toward head classes. It introduces ComPlementary Experts (CPE), a multi-head framework in which three experts with distinct logit adjustments each model a different class distribution, complemented by Classwise Batch Normalization (CBN) to stabilize tail-class features. The method achieves state-of-the-art performance on CIFAR-10-LT, CIFAR-100-LT, and STL-10-LT across consistent, uniform, and inverse unlabeled distributions, with notable gains when the labeled and unlabeled distributions diverge. The results indicate that combining distribution-specific experts with class-aware normalization yields more reliable pseudo-labels and better representations in LTSSL, with only modest training-time overhead. Code is publicly available.
Abstract
We address the challenging problem of Long-Tailed Semi-Supervised Learning (LTSSL), where labeled data exhibit an imbalanced class distribution and unlabeled data follow an unknown distribution. Unlike in balanced SSL, the generated pseudo-labels are skewed towards head classes, intensifying the training bias. This effect is amplified when the class distributions of the labeled and unlabeled datasets are mismatched, since even more unlabeled data are then mislabeled as head classes. To solve this problem, we propose a novel method named ComPlementary Experts (CPE). Specifically, we train multiple experts to model various class distributions, each yielding high-quality pseudo-labels under one form of class distribution. In addition, we introduce Classwise Batch Normalization for CPE to avoid the performance degradation caused by feature distribution mismatch between head and non-head classes. CPE achieves state-of-the-art performance on the CIFAR-10-LT, CIFAR-100-LT, and STL-10-LT benchmarks. For instance, on CIFAR-10-LT, CPE improves test accuracy by over 2.22% compared to baselines. Code is available at https://github.com/machengcheng2016/CPE-LTSSL.
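The idea of training several experts, each tied to one form of class distribution, can be illustrated with a logit-adjusted cross-entropy. The following is a minimal sketch and not the released implementation: the function names, the three adjustment strengths in `TAUS`, and the use of the labeled class counts as the prior are all assumptions made for illustration. With strength 0 a head is trained with plain cross-entropy and so fits the long-tailed labeled distribution. Larger strengths add a scaled log-prior to the logits during training, which pushes the head's raw logits toward a uniform or tail-favoring distribution.

```python
import math

# Hypothetical adjustment strengths, one per expert (assumed values, not from the paper):
# 0.0 -> plain cross-entropy, 1.0 -> balanced, 2.0 -> over-compensated toward tail classes.
TAUS = (0.0, 1.0, 2.0)

def log_prior(class_counts):
    """Log of the empirical class frequencies of the labeled set."""
    total = sum(class_counts)
    return [math.log(c / total) for c in class_counts]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def adjusted_cross_entropy(logits, label, prior_log, tau):
    """Cross-entropy computed on logits shifted by tau * log-prior."""
    shifted = [z + tau * p for z, p in zip(logits, prior_log)]
    return -math.log(softmax(shifted)[label])

def expert_losses(head_logits, label, class_counts, taus=TAUS):
    """One loss per expert head; head_logits holds one logit list per head."""
    prior_log = log_prior(class_counts)
    return [
        adjusted_cross_entropy(z, label, prior_log, t)
        for z, t in zip(head_logits, taus)
    ]

# Toy usage: two classes (head: 100 samples, tail: 10 samples), a tail-class label,
# and three heads that all output uninformative logits.
losses = expert_losses([[0.0, 0.0]] * 3, label=1, class_counts=[100, 10])
```

In this toy case the loss on a tail-class sample grows with the adjustment strength, so heads trained with stronger adjustment are pushed harder to raise tail-class logits. How the experts are combined for pseudo-labeling and at test time is not covered by this sketch.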
