Navigating limitations with precision: A fine-grained ensemble approach to wrist pathology recognition on a limited x-ray dataset
Ammar Ahmed, Ali Shariq Imran, Mohib Ullah, Zenun Kastrati, Sher Muhammad Daudpota
TL;DR
This work reframes wrist pathology recognition on limited X-ray data as a fine-grained visual recognition (FGVR) problem and introduces a Plug-in Module (PIM) ensemble that combines a weakly supervised selector for identifying discriminative regions, multi-scale features from a Feature Pyramid Network (FPN), and graph-based feature fusion. The memory-efficient Evolved Sign Momentum (LION) optimizer is integrated to improve generalization, and the approach is evaluated on a carefully curated wrist subset of the GRAZPEDWRI dataset, where it outperforms numerous baselines and FGVR methods. Ablation studies demonstrate the contributions of LION, region selection, and the FPN configuration, and heatmap analyses show that the model attends to focused, discriminative regions even in challenging, limited-data scenarios. The results suggest practical potential for automated, explainable wrist pathology recognition that requires minimal manual annotation. Future work is aimed at wrist-specific FGVR architectures and larger datasets with image-level annotations to further enhance interpretability and accuracy.
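To make the "memory-efficient" remark about LION concrete, the following is a minimal sketch of the standard Lion update rule in plain Python. It is not the authors' training code: the function name `lion_step`, the list-based parameter layout, and the hyperparameter values are illustrative assumptions, not settings reported in this work. The point it shows is that the optimizer keeps a single momentum buffer per parameter and applies only the sign of an interpolated gradient.

```python
def sign(x):
    """Return -1.0, 0.0, or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def lion_step(params, grads, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """One Lion update over flat lists of floats (hypothetical helper).

    Only one state value per parameter (the momentum) is stored, which is
    why Lion needs less optimizer memory than Adam-style methods that track
    two moment estimates.
    """
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Interpolate momentum and gradient, then keep only the sign.
        update = sign(beta1 * m + (1.0 - beta1) * g)
        # Apply the signed step plus decoupled weight decay.
        new_params.append(p - lr * (update + weight_decay * p))
        # The stored momentum uses a second, slower interpolation factor.
        new_momentum.append(beta2 * m + (1.0 - beta2) * g)
    return new_params, new_momentum
```

Because every coordinate moves by the same magnitude `lr`, the learning rate used with this rule is typically chosen smaller than one would use with Adam-style optimizers.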
Abstract
Automated wrist fracture recognition has gained considerable research attention in recent years. In practical medical scenarios, physicians and surgeons may lack the specialized expertise required for accurate X-ray interpretation, highlighting the need for machine vision to enhance diagnostic accuracy. However, conventional recognition techniques struggle to discern subtle differences in X-rays when classifying wrist pathologies, as many of these pathologies, such as fractures, can be small and hard to distinguish. This study tackles wrist pathology recognition as a fine-grained visual recognition (FGVR) problem, utilizing a limited, custom-curated dataset that mirrors real-world medical constraints and relies solely on image-level annotations. We introduce a specialized FGVR-based ensemble approach to identify discriminative regions within X-rays, and we employ Grad-CAM, an Explainable AI (XAI) technique, to pinpoint these regions. Our ensemble approach outperformed many conventional state-of-the-art (SOTA) and FGVR techniques, underscoring the effectiveness of our strategy in enhancing accuracy in wrist pathology recognition.
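As a rough illustration of how Grad-CAM turns a network's internal signals into a region heatmap, the sketch below computes the standard Grad-CAM map from precomputed inputs. It is a simplified, framework-free example rather than the pipeline used in this study: the function name `grad_cam` and the nested-list inputs are assumptions, and in practice the activations and the gradients of the class score would be taken from a convolutional layer of the trained model.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from K feature maps (hypothetical helper).

    activations: K feature maps, each an H x W nested list.
    gradients:   gradients of the class score with respect to those maps,
                 with the same shape.
    """
    height = len(activations[0])
    width = len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for act, grad in zip(activations, gradients):
        # Channel weight: spatial average of the gradient for this map.
        alpha = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                cam[i][j] += alpha * act[i][j]
    # ReLU keeps only locations that push the class score upward.
    return [[max(0.0, value) for value in row] for row in cam]
```

The resulting map has the resolution of the chosen feature layer, so it is usually upsampled and overlaid on the X-ray before being inspected.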
