Proof of Concept for Mammography Classification with Enhanced Compactness and Separability Modules

Fariza Dahes

TL;DR

This work tests whether Xia et al.'s GAGM and SEVector enhancements, originally designed for CPU-friendly medical imaging, transfer to mammography classification on a Kaggle-derived dataset. It extends the framework with multi-metric evaluation, Grad-CAM interpretability, and an interactive dashboard, and also evaluates the role of Feature Smoothing Loss. Across backbones, InceptionV3 emerges as the most accurate and robust model, achieving perfect or near-perfect ROC-AUC across classes, though at higher computational cost; ConvNeXt-Tiny performs strongly but less stably, and the baseline CNN is consistent but less competitive. The study highlights the importance of latent-space organization and explainability, and suggests future work on lightweight, generalizable, and more interpretable approaches with external validation in diverse clinical settings.

Abstract

This study validates and extends a recent methodological framework for medical image classification. An improved ConvNeXt-Tiny architecture, integrating Global Average and Max Pooling fusion (GAGM), lightweight channel attention (SEVector), and Feature Smoothing Loss (FSL), demonstrated promising results on Alzheimer MRI under CPU-friendly conditions; our work investigates whether it transfers to mammography classification. Using a Kaggle dataset that consolidates the INbreast, MIAS, and DDSM mammography collections, we compare a baseline CNN with ConvNeXt-Tiny and InceptionV3 backbones enriched with GAGM and SEVector modules. Results confirm the effectiveness of GAGM and SEVector in enhancing feature discriminability and reducing false negatives, particularly for malignant cases. In our experiments, however, the Feature Smoothing Loss did not yield measurable improvements for mammography classification, suggesting that its effectiveness may depend on specific architectural and computational assumptions. Beyond validation, our contribution extends the original framework through multi-metric evaluation (macro F1, per-class recall variance, ROC/AUC), feature interpretability analysis (Grad-CAM), and the development of an interactive dashboard for clinical exploration. Looking ahead, we highlight the need to explore alternative approaches to improving intra-class compactness and inter-class separability, with the specific goal of better distinguishing malignant from benign cases in mammography classification.
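
To make the multi-metric evaluation named above more concrete, the following is a minimal sketch of how macro F1 and per-class recall variance can be computed from predicted labels. It is not the authors' evaluation code: the function names, the three class labels, and the toy predictions are illustrative assumptions, and the variance is taken as the population variance over classes, which may differ from the definition used in the paper.

```python
# Illustrative sketch only: function names, labels, and data are assumptions,
# not taken from the paper.

def confusion_counts(y_true, y_pred, cls):
    """Return (tp, fp, fn) for one class in a multi-class setting."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    return tp, fp, fn

def per_class_recall(y_true, y_pred, labels):
    """Recall of each class: tp / (tp + fn), or 0.0 if the class is absent."""
    recalls = {}
    for cls in labels:
        tp, _, fn = confusion_counts(y_true, y_pred, cls)
        recalls[cls] = tp / (tp + fn) if (tp + fn) else 0.0
    return recalls

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so minority classes count equally."""
    f1_scores = []
    for cls in labels:
        tp, fp, fn = confusion_counts(y_true, y_pred, cls)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

def recall_variance(recalls):
    """Population variance of per-class recall (assumed definition)."""
    values = list(recalls.values())
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Toy example with hypothetical class names and predictions.
labels = ["benign", "malignant", "normal"]
y_true = ["benign"] * 4 + ["malignant"] * 4 + ["normal"] * 2
y_pred = ["benign", "benign", "benign", "malignant",
          "malignant", "malignant", "benign", "benign",
          "normal", "normal"]

recalls = per_class_recall(y_true, y_pred, labels)
score = macro_f1(y_true, y_pred, labels)
spread = recall_variance(recalls)
```

In this toy case the malignant class has the lowest recall (0.5), which is the kind of imbalance that a single accuracy figure would hide and that a minimum-recall or recall-variance metric is meant to expose.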

Paper Structure

This paper contains 12 sections, 4 equations, 16 figures, 2 tables.

Figures (16)

  • Figure 1: Most Common Cancer Site per Country (2022). Source: International Agency for Research on Cancer (IARC), WHO.
  • Figure 2: Patient Pathway and Imaging Modalities in Breast Cancer Care. Sequential overview of the patient journey in breast cancer diagnosis and treatment, highlighting the imaging techniques used at each stage.
  • Figure 3: Class Distribution Before and After Targeted Augmentation. Comparison of class proportions before and after targeted data augmentation on the normal class. Initially underrepresented (7.6%), the normal class was synthetically enriched to reach 33.1%, reducing class imbalance and improving model exposure to low-frequency patterns. This strategy supports better generalization and recall for the minority class, which is critical in screening contexts where false negatives carry high clinical risk.
  • Figure 4: Pairwise Scatter Plots Comparing Model Performance. Accuracy vs Loss, Macro F1 vs Minimum Recall, and Recall Dispersion vs Overfitting Loss. These visualizations highlight the trade-offs between precision, robustness, and equity across the three models.
  • Figure 5: Radar Plots Comparing Global Performance, Class Coverage, and Overfitting Across Models. Improved InceptionV3 (IIV3) consistently forms near-ideal polygons, reflecting high precision, equitable class treatment, and robust generalization. Improved ConvNeXt-Tiny (ICNT) shows competitive global performance but suffers from recall dispersion and overfitting. The baseline CNN offers balanced but lower scores, serving as a stable reference.
  • ...and 11 more figures