MIAFEx: An Attention-based Feature Extraction Method for Medical Image Classification

Oscar Ramos-Soto, Jorge Ramos-Frutos, Ezequiel Perez-Zarate, Diego Oliva, Sandra E. Balderas-Mata

TL;DR

MIAFEx adds a learnable refinement of the classification (CLS) token inside a Transformer encoder to improve feature extraction for medical image classification under data scarcity and high intra-class variability. Its features are compared with classical descriptors paired with traditional ML classifiers, and with CNN and ViT deep learning (DL) baselines, across seven diverse medical imaging datasets; wrapper-based feature selection (WFS) is applied to further improve performance. MIAFEx, especially when paired with WFS-DE or WFS-GA, consistently outperforms traditional descriptors and many DL models on small-to-medium datasets, whereas large-scale DL models such as ViT take the lead only as dataset size grows. The work presents MIAFEx as a lightweight, interpretable, and data-efficient approach for clinical contexts, with a clear path to integration and future extensions to larger datasets and additional modalities.

Abstract

Feature extraction techniques are crucial in medical image classification; however, classical feature extractors combined with traditional machine learning classifiers often fail to provide sufficient discriminative information for complex image sets. While Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have shown promise in feature extraction, they are prone to overfitting due to the inherent characteristics of medical imaging data, including small sample sizes and high intra-class variance. In this work, the Medical Image Attention-based Feature Extractor (MIAFEx) is proposed, a novel method that employs a learnable refinement mechanism to enhance the classification token within the Transformer encoder architecture. This mechanism adjusts the token based on learned weights, improving the extraction of salient features and enhancing the model's adaptability to the challenges presented by medical imaging data. The quality of the MIAFEx output features is compared against classical feature extractors using traditional and hybrid classifiers. The performance of these features is also compared against modern CNN and ViT models in classification tasks, demonstrating their superiority in accuracy and robustness across multiple complex medical imaging datasets. This advantage is particularly pronounced in scenarios with limited training data, where traditional and modern models often struggle to generalize effectively. The source code of this proposal can be found at https://github.com/Oscar-RamosS/Medical-Image-Attention-based-Feature-Extractor-MIAFEx
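
To make the refinement idea concrete, the following is a minimal sketch, not the authors' implementation. The abstract states only that the mechanism "adjusts the token based on learned weights"; the sketch assumes this is an element-wise rescaling of the CLS embedding by a weight vector of the same dimension. The function name `refine_cls_token`, the toy embedding size, and the fixed weight values are all hypothetical; in a real model the weights would be trained jointly with the encoder rather than set by hand.

```python
import numpy as np

def refine_cls_token(cls_token, weights):
    """Rescale a CLS embedding element-wise by a weight vector.

    Assumption: the refinement is an element-wise product. The abstract
    only says the token is adjusted "based on learned weights".
    """
    cls_token = np.asarray(cls_token, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if cls_token.shape[-1] != weights.shape[-1]:
        raise ValueError("weights must match the embedding dimension")
    # Broadcasting applies the same weight vector to every item in the batch.
    return cls_token * weights

# Toy batch of two CLS embeddings with a hypothetical dimension of 4.
cls_batch = np.array([[1.0, 2.0, 3.0, 4.0],
                      [0.5, 0.5, 0.5, 0.5]])
# Hypothetical weights, fixed here for illustration instead of being learned.
w = np.array([1.0, 0.0, 2.0, 0.5])

features = refine_cls_token(cls_batch, w)
print(features)
```

Under this assumption, a weight near zero suppresses an embedding dimension and a weight above one amplifies it, so the refined vector can be passed directly to a downstream classifier or a feature selection step.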

Paper Structure (33 sections, 13 equations, 13 figures, 19 tables)

This paper contains 33 sections, 13 equations, 13 figures, 19 tables.

Figures (13)

  • Figure 1: General diagram of the MIAFEx.
  • Figure 2: Histological biopsy images.
  • Figure 3: Ocular alignment images.
  • Figure 4: Eye fundus images.
  • Figure 5: Breast ultrasound images.
  • ...and 8 more figures