SMCL: Saliency Masked Contrastive Learning for Long-tailed Recognition
Sanglee Park, Seung-won Hwang, Jungmin So
TL;DR
The paper addresses long-tailed recognition, where background features that are under-observed in minor classes bias predictions toward major classes. It introduces Saliency Masked Contrastive Learning (SMCL), which masks the salient regions of an image and uses minor-class-biased sampling together with a mixed loss combining $L_{MCE}$ and $L_{MSC}$ to pull the masked backgrounds toward minor classes in feature space. Results on CIFAR-10-LT, CIFAR-100-LT, and ImageNet-LT show competitive or state-of-the-art performance, and ablations confirm the contribution of both saliency masking and the contrastive objective. The approach is simple to implement and improves generalization by mitigating background-feature bias.
Abstract
Real-world data often follow a long-tailed distribution, with a high imbalance in the number of samples between classes. The problem with training on imbalanced data is that some background features, although common to all classes, may go unobserved in classes with scarce samples. As a result, these background features become correlated with the "major" classes and bias predictions toward them. In this paper, we propose saliency masked contrastive learning, a new method that uses saliency masking and contrastive learning to mitigate this problem and improve the generalizability of a model. Our key idea is to mask the important part of an image using saliency detection and to use contrastive learning to move the masked image towards minor classes in the feature space, so that the background features present in the masked image are no longer correlated with the original class. Experimental results show that our method achieves state-of-the-art performance on benchmark long-tailed datasets.
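The two data-side steps of the key idea, masking the salient region and pairing the masked image with a label drawn in favor of minor classes, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the saliency detector, the mask shape, or the sampling distribution, so the square patch centered on the saliency peak, the inverse-frequency weights, and the names `mask_salient_region` and `minor_biased_label` are all assumptions made for illustration. The contrastive loss itself is omitted.

```python
import random


def minor_biased_label(class_counts, rng):
    """Sample a class index with probability inversely proportional to its
    sample count (an assumed form of minor-class-biased sampling)."""
    weights = [1.0 / count for count in class_counts]
    return rng.choices(range(len(class_counts)), weights=weights, k=1)[0]


def mask_salient_region(image, saliency, box=2, fill=0.0):
    """Return a copy of `image` with a box x box patch around the saliency
    peak replaced by `fill`. Both inputs are 2D lists of the same shape,
    and `box` is assumed to be no larger than either dimension."""
    height, width = len(image), len(image[0])
    # Locate the most salient pixel (first one wins on ties).
    peak_r, peak_c = max(
        ((r, c) for r in range(height) for c in range(width)),
        key=lambda rc: saliency[rc[0]][rc[1]],
    )
    # Clamp the patch so that it stays inside the image.
    top = min(max(peak_r - box // 2, 0), height - box)
    left = min(max(peak_c - box // 2, 0), width - box)
    masked = [row[:] for row in image]  # copy, leave the input untouched
    for r in range(top, top + box):
        for c in range(left, left + box):
            masked[r][c] = fill
    return masked


# Toy usage: a 4x4 "image" whose saliency peaks at (1, 1).
image = [[1.0] * 4 for _ in range(4)]
saliency = [[0.0] * 4 for _ in range(4)]
saliency[1][1] = 1.0
masked = mask_salient_region(image, saliency)

# Two classes: a major class with 5000 samples and a minor class with 50.
rng = random.Random(0)
new_label = minor_biased_label([5000, 50], rng)
```

In this sketch the masked image keeps only background content, and the sampled label is the minor class far more often than the major one. Under the stated assumptions, that pairing is what lets a training objective pull background features away from the original class.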
