Saliency-guided and Patch-based Mixup for Long-tailed Skin Cancer Image Classification
Tianyunxi Wei, Yijin Huang, Li Lin, Pujin Cheng, Sirui Li, Xiaoying Tang
TL;DR
This work addresses long-tailed skin cancer image classification by introducing SPMix, a saliency-guided and patch-based mixup framework. Guided by lesion saliency maps, it blends tail-class samples with head-class backgrounds at the feature level. Mixup ratios are set per patch, using $r = \min(\alpha, \max(s_h, s_t))$ along with patch-wise ratios $r_i = \text{avg}(s_i)$, and the mixed samples are then used for transformer-based representation learning with a supervised contrastive loss. The key contributions are the saliency-guided mixup mechanism, lesion-aware per-patch mixing, and a framework that integrates the supervised contrastive loss, which together improve tail-class performance while preserving head-class accuracy. On the ISIC2018 dataset, SPMix outperforms prior methods. The approach is relevant to medical image analysis more broadly, where data imbalance is common and lesion-focused diagnosis is critical, because it offers an augmentation strategy that preserves the diagnostic features of tail classes.
Abstract
Medical image datasets often exhibit long-tailed distributions due to the inherent challenges of medical data collection and annotation. In long-tailed settings, a few common disease categories account for most of the data, while rare disease categories have only a few samples, which degrades the performance of deep learning methods. To address this issue, previous approaches have employed class re-sampling or re-weighting techniques, which often overfit to tail classes or make training difficult to optimize. In this work, we propose a novel approach, Saliency-guided and Patch-based Mixup (SPMix), for long-tailed skin cancer image classification. Specifically, given a tail-class image and a head-class image, we generate a new tail-class image by mixing them under the guidance of saliency mapping, which preserves and augments the discriminative features of the tail class without interference from head-class features. Extensive experiments on the ISIC2018 dataset demonstrate the superiority of SPMix over existing state-of-the-art methods.
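
The patch-wise mixing idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes mixing in pixel space, a tail-class saliency map already scaled to [0, 1], and image sides divisible by the patch size. It implements only the patch-wise ratio $r_i = \text{avg}(s_i)$ from the TL;DR. The ratio $r = \min(\alpha, \max(s_h, s_t))$ is left out because its terms are not defined in this section. The function name `patch_saliency_mixup` and its parameters are hypothetical.

```python
import numpy as np


def patch_saliency_mixup(tail_img, head_img, tail_saliency, patch=4):
    """Mix a tail-class image with a head-class image, patch by patch.

    tail_img, head_img: float arrays of shape (H, W, C).
    tail_saliency: float array of shape (H, W), values in [0, 1] (assumption).
    patch: patch side length; H and W must be divisible by it (assumption).
    Returns the mixed image and the (H/patch, W/patch) grid of ratios.
    """
    h, w, _ = tail_img.shape
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    gh, gw = h // patch, w // patch

    # Per-patch ratio r_i: mean tail saliency inside patch i.
    r = tail_saliency.reshape(gh, patch, gw, patch).mean(axis=(1, 3))

    # Expand the ratios back to pixel resolution (constant within a patch).
    r_full = np.repeat(np.repeat(r, patch, axis=0), patch, axis=1)[..., None]

    # Salient (lesion) patches keep tail content; the rest take head content.
    mixed = r_full * tail_img + (1.0 - r_full) * head_img
    return mixed, r
```

In this sketch, a patch with high average saliency is dominated by the tail-class image, so lesion regions are kept, while low-saliency patches are filled mostly from the head-class image. Following the abstract, the mixed image would be labeled with the tail class.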
