Evaluating Deep Learning Models for Breast Cancer Classification: A Comparative Study
Sania Eskandari, Ali Eslamian, Nusrat Munia, Amjad Alqarni, Qiang Cheng
TL;DR
This work compares eight ImageNet-pretrained deep learning models, including a Vision Transformer (ViT), for detecting invasive ductal carcinoma (IDC) in a large patch-level dataset of breast histopathology images. After a 10-epoch training regime, the ViT achieves the highest overall accuracy (~93%), with strong, balanced precision and recall for both the IDC and non-IDC classes, outperforming the conventional CNNs. The results indicate that attention-based architectures can improve diagnostic accuracy and streamline clinical workflows in breast cancer pathology. Suggested future directions include data augmentation, ensembles, interpretability, and broader clinical integration to further improve robustness and real-time applicability.
Abstract
This study evaluates the effectiveness of deep learning models in classifying histopathological images for early and accurate detection of breast cancer. Eight models were compared on a dataset of 277,524 image patches: ResNet-50, DenseNet-121, ResNeXt-50, Vision Transformer (ViT), GoogLeNet (Inception v3), EfficientNet, MobileNet, and SqueezeNet. The ViT model, which relies on attention-based mechanisms, achieved the highest validation accuracy (94%), outperforming the conventional CNNs. The study demonstrates the potential of advanced machine learning methods to enhance precision and efficiency in breast cancer diagnosis in clinical settings.
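The accuracy, precision, and recall figures quoted above are standard metrics for a binary IDC versus non-IDC classifier. As a minimal sketch of how such per-class figures can be derived from patch-level predictions, the Python snippet below counts the four confusion-matrix cells and computes each metric. The function name `binary_report`, the label convention (1 = IDC, 0 = non-IDC), and the toy labels are assumptions made for illustration; they are not taken from the study and do not reproduce its results.

```python
from collections import Counter


def binary_report(y_true, y_pred, positive=1):
    """Return accuracy plus per-class precision and recall for 0/1 labels.

    Assumed convention (illustrative only): 1 = IDC patch, 0 = non-IDC patch.
    """
    counts = Counter(zip(y_true, y_pred))  # keys are (true, predicted) pairs
    negative = 1 - positive
    tp = counts[(positive, positive)]  # IDC correctly flagged
    tn = counts[(negative, negative)]  # non-IDC correctly cleared
    fp = counts[(negative, positive)]  # non-IDC wrongly flagged as IDC
    fn = counts[(positive, negative)]  # IDC missed

    def ratio(num, den):
        # Guard against empty classes so the sketch never divides by zero.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "idc_precision": ratio(tp, tp + fp),
        "idc_recall": ratio(tp, tp + fn),
        "non_idc_precision": ratio(tn, tn + fn),
        "non_idc_recall": ratio(tn, tn + fp),
    }


# Toy labels, not data from the study: 3 IDC patches and 5 non-IDC patches.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
report = binary_report(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

Reporting precision and recall separately for each class, rather than accuracy alone, shows whether a model performs comparably on IDC and non-IDC patches or favors one class at the expense of the other.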
