Mix from Failure: Confusion-Pairing Mixup for Long-Tailed Recognition
Youngseok Yoon, Sangwoo Hong, Hyungjun Joo, Yao Qin, Haewon Jeong, Jungwoo Lee
TL;DR
This work tackles long-tailed image recognition by addressing model confusion rather than solely adjusting losses or architectures. It introduces Confusion-Pairing Mixup (CP-Mix), which estimates the model's confusion distribution in real time and augments data by mixing samples from confusion pairs, with an imbalance-aware labeling strategy. In extensive experiments on CIFAR-LT, ImageNet-LT, Places-LT, and iNaturalist 2018, CP-Mix consistently improves minority-class performance and reduces misclassification between confusing class pairs, while remaining compatible with ensemble methods. The approach is simple to implement, model-agnostic, and offers a practical augmentation technique to enhance generalization under severe class imbalance.
Abstract
Long-tailed image recognition is a computer vision problem that considers a real-world class distribution rather than an artificial uniform one. Existing methods typically sidestep the problem by i) adjusting the loss function, ii) decoupling classifier learning, or iii) proposing a new multi-head architecture whose heads are called experts. In this paper, we tackle the problem from a different perspective: augmenting the training dataset to enhance the sample diversity of minority classes. Specifically, our method, Confusion-Pairing Mixup (CP-Mix), estimates the confusion distribution of the model and handles the data deficiency problem by augmenting samples from confusion pairs in real time. In this way, CP-Mix trains the model to mitigate its weakness and to distinguish pairs of classes that it frequently confuses. In addition, CP-Mix uses a novel mixup formulation to handle the bias in decision boundaries that originates from the imbalanced dataset. Extensive experiments demonstrate that CP-Mix outperforms existing methods for long-tailed image recognition and successfully relieves the confusion of the classifier.
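The idea of pairing samples by confusion can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the estimator or the mixup formula, so the moving-average confusion estimate, the inverse-frequency label weighting, and the function names `update_confusion`, `sample_confusion_pairs`, and `mix_pair` are all assumptions made for illustration.

```python
import numpy as np


def update_confusion(conf, y_true, y_pred, momentum=0.9):
    """Moving average of confusion counts (assumed estimator, not from the paper)."""
    batch = np.zeros_like(conf)
    # Accumulate one count per (true class, predicted class) occurrence.
    np.add.at(batch, (y_true, y_pred), 1.0)
    return momentum * conf + (1.0 - momentum) * batch


def sample_confusion_pairs(conf, n_pairs, rng):
    """Draw (true, predicted) class pairs in proportion to off-diagonal confusion."""
    off_diag = conf.copy()
    np.fill_diagonal(off_diag, 0.0)
    if off_diag.sum() <= 0:
        # No confusion observed yet: fall back to uniform over distinct pairs.
        off_diag = np.ones_like(conf)
        np.fill_diagonal(off_diag, 0.0)
    probs = off_diag / off_diag.sum()
    flat = rng.choice(conf.size, size=n_pairs, p=probs.ravel())
    return np.stack(np.unravel_index(flat, conf.shape), axis=1)


def mix_pair(x_a, x_b, class_a, class_b, class_counts, lam):
    """Mix two inputs; the soft label is shifted toward the rarer class.

    The inverse-frequency weighting is a hypothetical stand-in for the
    paper's imbalance-aware labeling, which the abstract does not specify.
    """
    x_mix = lam * x_a + (1.0 - lam) * x_b
    w_a = lam / class_counts[class_a]
    w_b = (1.0 - lam) / class_counts[class_b]
    y_mix = np.zeros(len(class_counts))
    y_mix[class_a] = w_a / (w_a + w_b)
    y_mix[class_b] = w_b / (w_a + w_b)
    return x_mix, y_mix


rng = np.random.default_rng(0)
class_counts = np.array([100.0, 10.0, 1.0])  # toy long-tailed class sizes
conf = np.zeros((3, 3))
# Class 0 is predicted as class 1 twice; classes 1 and 2 are predicted correctly.
conf = update_confusion(conf, np.array([0, 0, 1, 2]), np.array([1, 1, 1, 2]), momentum=0.0)
pairs = sample_confusion_pairs(conf, n_pairs=4, rng=rng)
x_mix, y_mix = mix_pair(np.ones(4), np.zeros(4), 0, 1, class_counts, lam=0.5)
```

In a training loop, one would presumably refresh `conf` from the model's predictions as training proceeds, then draw one sample from each class of a sampled pair before calling `mix_pair`. How often the estimate is refreshed and how the mixing coefficient is drawn are design choices this sketch leaves open.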
