Attention-Guided Erasing: A Novel Augmentation Method for Enhancing Downstream Breast Density Classification
Adarsh Bhandary Panambur, Hui Yu, Sheethal Bhat, Prathmesh Madhu, Siming Bayer, Andreas Maier
TL;DR
This work tackles automated four-class breast density classification in a Vietnamese population with high-density breasts. It introduces Attention-Guided Erasing (AGE), a data augmentation technique that uses attention maps from a DINO self-supervised Vision Transformer to erase background regions, thereby highlighting dense tissue during transfer learning. On the VinDr-Mammo dataset, AGE achieves a mean Macro F1-score of 0.5910, compared with 0.5594 without erasing and 0.5691 with random erasing, and the improvement is statistically significant ($p<0.0001$). The results suggest that focusing the model on dense-tissue regions via attention-guided masking yields more robust density classification and may generalize to other medical imaging tasks.
Abstract
Assessing breast density is crucial in breast cancer screening, especially in populations with a higher proportion of dense breast tissue. This study introduces a novel data augmentation technique, Attention-Guided Erasing (AGE), designed to improve downstream classification of the four breast density categories defined by the BI-RADS recommendation in mammograms from a Vietnamese cohort. The method injects supplementary information during transfer learning by using visual attention maps derived from a vision transformer backbone trained with the self-supervised DINO method. These maps are used to erase background regions in the mammogram images, exposing only the potential areas of dense breast tissue to the network. When AGE is applied during transfer learning with varying random probabilities, classification performance consistently exceeds that obtained without AGE and with the traditional random erasing (RE) transformation. We validate our methodology on the publicly available VinDr-Mammo dataset. Specifically, we attain a mean F1-score of 0.5910, compared with 0.5594 without AGE and 0.5691 with RE. The improvement is statistically significant according to t-tests (p<0.0001).
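The erasing step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `attention_guided_erase`, the quantile-based threshold `keep_fraction`, the zero fill value, and the nearest-neighbour upsampling of the patch-level attention map are all assumptions made for illustration. Only the overall idea (erase low-attention background with some random probability `p`) comes from the text.

```python
import numpy as np

def attention_guided_erase(image, attn, p=0.5, keep_fraction=0.4, rng=None):
    """Erase low-attention pixels of a 2D image with probability p.

    image: (H, W) array, e.g. a grayscale mammogram.
    attn:  (h, w) patch-level attention map, with H % h == 0 and W % w == 0.
    keep_fraction: hypothetical knob; roughly this share of the
                   highest-attention pixels is kept, the rest is set to 0.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Apply the augmentation only with probability p.
    if rng.random() >= p:
        return image.copy()

    H, W = image.shape
    h, w = attn.shape
    # Nearest-neighbour upsampling of the patch grid to image resolution
    # (an assumption; an interpolating resize would also be plausible).
    attn_full = np.repeat(np.repeat(attn, H // h, axis=0), W // w, axis=1)

    # Keep pixels whose attention is at or above the chosen quantile.
    threshold = np.quantile(attn_full, 1.0 - keep_fraction)
    mask = attn_full >= threshold

    # Erase (zero out) everything outside the high-attention mask.
    return np.where(mask, image, 0).astype(image.dtype)
```

In a transfer-learning pipeline, a function like this would be applied per training image, with `attn` taken from the self-supervised backbone's attention for that image and `p` varied across experiments, as the abstract describes.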
