AGGRNet: Selective Feature Extraction and Aggregation for Enhanced Medical Image Classification
Ansh Makwe, Akansh Agrawal, Prateek Jain, Akshan Agrawal, Priyanka Bagade
TL;DR
AGGRNet targets fine-grained medical image classification, where high inter-class similarity and intra-class variability make classes hard to separate. It introduces a dedicated Feature Extraction and Aggregation (FEA) module that separates informative from non-informative features and fuses global context via cross-attention. The module's feature extraction (FEM) and feature aggregation (FAM) components, aided by an adaptive threshold and a cross-stage partial channel attention (C2PCA) block, provide selective feature emphasis and robust global reasoning when integrated into a CNN backbone (YOLOv11). Across LIMUC, ISIC2018, Kvasir, PathMNIST, and RetinaMNIST, AGGRNet achieves state-of-the-art performance, with accuracy gains of up to 5% on Kvasir and notable improvements on the other datasets; extensive ablations support the contribution of each module. The work is relevant to disease-subtype classification and severity grading, where more reliable and interpretable feature representations could reduce subjective bias in clinical decision-making.
Abstract
Medical image analysis for complex tasks such as severity grading and disease subtype classification poses significant challenges due to intricate and similar visual patterns among classes, scarcity of labeled data, and variability in expert interpretations. Although existing attention-based models are useful for capturing complex visual patterns in medical image classification, their underlying architectures often fail to distinguish subtle classes because they struggle to handle inter-class similarity and intra-class variability, which can lead to incorrect diagnoses. To address this, we propose the AGGRNet framework, which extracts informative and non-informative features to better capture fine-grained visual patterns and improve classification in complex medical image analysis tasks. Experimental results show that our model achieves state-of-the-art performance on various medical imaging datasets, with improvements of up to 5% over SOTA models on the Kvasir dataset.
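
As a rough illustration of the idea summarized in the TL;DR (splitting features into informative and non-informative sets with an adaptive threshold, then fusing global context through cross-attention), a toy sketch in plain Python might look like the following. It is not the AGGRNet implementation: the saliency score (mean absolute activation), the use of the mean score as the threshold, the absence of learned projections, and all function names are assumptions made purely for illustration.

```python
import math

def token_scores(tokens):
    # Assumed saliency proxy: mean absolute activation of each token.
    return [sum(abs(v) for v in t) / len(t) for t in tokens]

def split_by_adaptive_threshold(tokens):
    # Assumed adaptive threshold: the mean score over the current input,
    # so the cut-off moves with the data instead of being fixed.
    scores = token_scores(tokens)
    threshold = sum(scores) / len(scores)
    informative = [t for t, s in zip(tokens, scores) if s >= threshold]
    non_informative = [t for t, s in zip(tokens, scores) if s < threshold]
    return informative, non_informative, threshold

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, context):
    # Scaled dot-product attention without learned projections:
    # each query attends over every context token.
    dim = len(queries[0])
    fused = []
    for q in queries:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in context]
        weights = softmax(logits)
        fused.append([sum(w * k[d] for w, k in zip(weights, context))
                      for d in range(dim)])
    return fused

# Four hypothetical 3-dimensional feature tokens (two strong, two weak).
tokens = [
    [0.9, -1.1, 0.8],
    [0.05, 0.0, -0.1],
    [1.2, 0.7, -0.9],
    [0.1, -0.05, 0.0],
]

informative, non_informative, threshold = split_by_adaptive_threshold(tokens)
# Informative tokens act as queries; the full token set supplies global context.
fused = cross_attention(informative, tokens)
```

In this sketch the informative tokens query the full token set, so each fused vector is a weighted average of all tokens. How the actual FEM, FAM, and C2PCA blocks score, split, and combine features is defined in the paper itself and may differ substantially.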
