Mitigating annotation shift in cancer classification using single image generative models
Marta Buetas Arcas, Richard Osuala, Karim Lekadir, Oliver Díaz
TL;DR
This work tackles annotation shift in breast cancer classification from mammography by simulating shifts through varying annotation tightness, quantifying their impact, and mitigating them with single-image generative models (SinGAN). It shows that malignant-class performance is the most sensitive to annotation shift and that SinGAN-based data augmentation, especially when combined with traditional oversampling in an ensemble, substantially improves robustness; the fidelity of the generated images is supported by SiFID scores. The approach requires as few as four in-domain annotations to generate diverse in-domain variations, addressing data scarcity and class imbalance. Overall, the study shows the feasibility of one-shot generative augmentation for reducing domain shift in medical imaging and informs future strategies for robust CAD in mammography.
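As an illustration of what "varying annotation tightness" can mean in practice, the sketch below crops a lesion patch with a configurable margin around its bounding box. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `crop_with_margin`, the `(x_min, y_min, x_max, y_max)` box convention, and the fractional margin parameter are all hypothetical choices made for this example.

```python
import numpy as np

def crop_with_margin(image, bbox, margin_frac):
    """Crop `image` around `bbox`, padding each side by margin_frac * box size.

    bbox is (x_min, y_min, x_max, y_max) in pixels (an assumed convention).
    margin_frac = 0.0 gives a tight crop; larger values give looser crops.
    """
    x_min, y_min, x_max, y_max = bbox
    pad_x = int(round(margin_frac * (x_max - x_min)))
    pad_y = int(round(margin_frac * (y_max - y_min)))
    height, width = image.shape[:2]
    # Clip the enlarged box to the image bounds.
    x0 = max(0, x_min - pad_x)
    y0 = max(0, y_min - pad_y)
    x1 = min(width, x_max + pad_x)
    y1 = min(height, y_max + pad_y)
    return image[y0:y1, x0:x1]

# Placeholder array standing in for a mammogram; the box is illustrative only.
mammogram = np.zeros((100, 100), dtype=np.uint8)
lesion_box = (40, 40, 60, 60)
tight = crop_with_margin(mammogram, lesion_box, 0.0)
loose = crop_with_margin(mammogram, lesion_box, 0.5)
print(tight.shape, loose.shape)  # (20, 20) (40, 40)
```

Training on crops produced with one margin and testing on crops produced with another is one simple way to emulate a shift between annotators who outline lesions more or less tightly.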
Abstract
Artificial Intelligence (AI) has emerged as a valuable tool for assisting radiologists in breast cancer detection and diagnosis. However, the success of AI applications in this domain is constrained by the quantity and quality of available data: annotation procedures are limited and costly, and often lead to annotation shifts. This study simulates, analyses and mitigates annotation shifts in cancer classification in the breast mammography domain. First, a high-accuracy cancer risk prediction model is developed, which effectively distinguishes benign from malignant lesions. Next, model performance is used to quantify the impact of annotation shift. We uncover a substantial impact of annotation shift on multiclass classification performance, particularly for malignant lesions. We thus propose a training data augmentation approach based on single-image generative models for the affected class, requiring as few as four in-domain annotations to considerably mitigate annotation shift, while also addressing dataset imbalance. Lastly, we further increase performance by proposing and validating an ensemble architecture based on multiple models trained under different data augmentation regimes. Our study offers key insights into annotation shift in deep learning breast cancer classification and explores the potential of single-image generative models to overcome domain shift challenges.
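The abstract does not state how the ensemble combines its member models. One common option is soft voting, where the class probabilities of models trained under different augmentation regimes are averaged. The sketch below assumes that combination rule; the function name `soft_vote` and the probability values are hypothetical and are not results from the study.

```python
import numpy as np

def soft_vote(prob_list):
    """Average per-model class probabilities and pick the most likely class.

    prob_list: list of arrays, each of shape (n_samples, n_classes).
    Returns (mean probabilities, predicted class indices).
    """
    stacked = np.stack(prob_list, axis=0)   # (n_models, n_samples, n_classes)
    mean_probs = stacked.mean(axis=0)       # (n_samples, n_classes)
    return mean_probs, mean_probs.argmax(axis=1)

# Made-up probabilities for two samples and two classes (benign, malignant).
probs_oversampling = np.array([[0.6, 0.4], [0.2, 0.8]])  # e.g. model trained with oversampling
probs_generative = np.array([[0.3, 0.7], [0.4, 0.6]])    # e.g. model trained with synthetic images
mean_probs, predictions = soft_vote([probs_oversampling, probs_generative])
print(predictions)  # [1 1]
```

Averaging probabilities rather than hard labels lets a confident member outweigh an uncertain one, which is one reason soft voting is often tried first when members differ only in their training data.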
