Enhanced Multi-Class Classification of Gastrointestinal Endoscopic Images with Interpretable Deep Learning Model
Astitva Kamble, Vani Bandodkar, Saakshi Dharmadhikary, Veena Anand, Pradyut Kumar Sanki, Mei X. Wu, Biswabandhu Jana
TL;DR
This study addresses GI endoscopy image classification by developing an augmentation-free, EfficientNetB3-based network trained on the Kvasir dataset with eight classes, achieving 94.25% test accuracy. It combines strong discriminative performance with interpretability through LIME saliency maps, enabling clinicians to see which image regions drive predictions. The approach balances accuracy, computational efficiency, and transparency, making it suitable for resource-limited clinical settings. Overall, it contributes a compact, interpretable pipeline for GI endoscopy image classification with robust generalization to unseen data.
Abstract
Endoscopy is an essential procedure for evaluating the gastrointestinal (GI) tract and plays a pivotal role in identifying GI-related disorders. Recent advances in deep learning have brought substantial progress in detecting abnormalities, typically through intricate models and data augmentation methods. This research introduces a novel approach to enhance classification accuracy using 8,000 labeled endoscopic images from the Kvasir dataset, categorized into eight distinct classes. With EfficientNetB3 as the backbone, the proposed architecture eliminates reliance on data augmentation while keeping model complexity moderate. The model achieves a test accuracy of 94.25%, with precision and recall of 94.29% and 94.24%, respectively. Furthermore, Local Interpretable Model-agnostic Explanations (LIME) saliency maps are employed to enhance interpretability by highlighting the critical image regions that influence model predictions. Overall, this work highlights the importance of AI in advancing medical imaging by combining high classification accuracy with interpretability.
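For readers who want to see how the three reported figures relate to one another, the following is a minimal sketch of computing accuracy, precision, and recall for a multi-class classifier. It assumes macro averaging (an unweighted mean over classes); the abstract does not state which averaging scheme was used. The function name `classification_metrics` and the toy labels are illustrative only and are not taken from the paper.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) for label lists.

    Macro averaging is an assumption made for this sketch; the source
    does not say how precision and recall were averaged over classes.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))

    # True positives per class: samples where prediction matches the label.
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    pred_count = Counter(y_pred)  # how often each class was predicted
    true_count = Counter(y_true)  # how often each class actually occurs

    accuracy = sum(true_pos.values()) / len(y_true)

    # Per-class precision and recall, defined as 0.0 when undefined.
    precisions = [
        true_pos[c] / pred_count[c] if pred_count[c] else 0.0 for c in labels
    ]
    recalls = [
        true_pos[c] / true_count[c] if true_count[c] else 0.0 for c in labels
    ]

    macro_precision = sum(precisions) / len(labels)
    macro_recall = sum(recalls) / len(labels)
    return accuracy, macro_precision, macro_recall


# Toy data for illustration only (three classes, six images).
y_true = ["polyps", "polyps", "normal-cecum", "normal-cecum",
          "normal-pylorus", "normal-pylorus"]
y_pred = ["polyps", "normal-cecum", "normal-cecum", "normal-cecum",
          "normal-pylorus", "normal-pylorus"]

acc, prec, rec = classification_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

With macro averaging on a class-balanced test set, macro recall equals accuracy, so a small gap between the two would suggest either a slightly unbalanced split or a different averaging scheme.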
