Towards Accurate and Interpretable Neuroblastoma Diagnosis via Contrastive Multi-scale Pathological Image Analysis
Zhu Zhu, Shuo Jiang, Jingyuan Zheng, Yawen Li, Yifei Chen, Manli Zhao, Weizhong Gu, Feiwei Qin, Jinhu Wang, Gang Yu
TL;DR
Neuroblastoma subtyping from hematoxylin and eosin (H&E) whole-slide images (WSIs) is hampered by subjective manual assessment and by automated models that are computationally heavy and poorly interpretable. The authors propose CMSwinKAN, a lightweight, interpretable architecture that combines Swin KANsformer blocks, a contrastive-driven multi-scale fusion module, and a KAN-based classification head, augmented by a tissue-aware soft voting scheme for WSI-level diagnosis. Key contributions include a CMSA module for dynamic multi-scale fusion, a spline-based KAN head for improved nonlinear modeling, and an SVM-guided soft voting framework that maps patch-level predictions to slide-level labels using clinical priors. Experiments on the private PpNTs dataset and the public BreakHis dataset show better performance and generalization than state-of-the-art pathology models, supporting the method’s clinical relevance and offering a path toward robust AI-assisted neuroblastoma diagnosis.
Abstract
Neuroblastoma, an adrenal-derived tumor, is among the most common pediatric solid malignancies and is characterized by significant clinical heterogeneity. Timely and accurate pathological diagnosis from hematoxylin and eosin-stained whole-slide images is critical for patient prognosis. However, current diagnostic practice relies primarily on subjective manual examination by pathologists, which leads to inconsistent accuracy. Existing automated whole-slide image classification methods suffer from poor interpretability, limited feature extraction capability, and high computational cost, which restricts their practical clinical deployment. To overcome these limitations, we propose CMSwinKAN, a contrastive-learning-based multi-scale feature fusion model tailored for pathological image classification. CMSwinKAN enhances the Swin Transformer architecture by integrating a Kernel Activation Network within its multilayer perceptron and classification head modules, improving both interpretability and accuracy. By fusing multi-scale features and leveraging contrastive learning strategies, CMSwinKAN mimics the comprehensive approach of clinicians and captures both global and local tissue characteristics. Additionally, we introduce a heuristic soft voting mechanism, guided by clinical insights, that maps patch-level predictions to whole-slide image-level classifications. We evaluated CMSwinKAN on the publicly available BreakHis dataset and on the PpNTs dataset, which was established by our hospital. Results demonstrate that CMSwinKAN outperforms existing state-of-the-art pathology-specific models pre-trained on large datasets. Our source code is available at https://github.com/JSLiam94/CMSwinKAN.
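To make the patch-to-slide step concrete, the following is a minimal sketch of generic weighted soft voting: per-patch class probabilities are averaged, optionally with per-patch weights, and the slide label is the class with the highest aggregated probability. It is not the authors' tissue-aware, SVM-guided scheme, whose details are not given in this section. The function name `soft_vote`, the `weights` parameter (a hypothetical stand-in for any clinically motivated patch weighting), and the example numbers are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def soft_vote(
    patch_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[int, List[float]]:
    """Aggregate per-patch class probabilities into a slide-level prediction.

    patch_probs: one probability vector per patch (all of the same length).
    weights: optional non-negative weight per patch (hypothetical stand-in
             for a tissue-aware weighting); defaults to uniform weights.
    Returns (predicted class index, slide-level probability vector).
    """
    if not patch_probs:
        raise ValueError("at least one patch is required")
    if weights is None:
        weights = [1.0] * len(patch_probs)
    if len(weights) != len(patch_probs):
        raise ValueError("weights and patch_probs must have the same length")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(patch_probs[0])
    slide_probs = [0.0] * n_classes
    for probs, w in zip(patch_probs, weights):
        if len(probs) != n_classes:
            raise ValueError("all patches must have the same number of classes")
        for c, p in enumerate(probs):
            # Weighted average of the patch probabilities for class c.
            slide_probs[c] += w * p / total

    label = max(range(n_classes), key=lambda c: slide_probs[c])
    return label, slide_probs


# Illustrative numbers only: three patches, two classes.
patches = [[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]]
uniform_label, uniform_probs = soft_vote(patches)
weighted_label, weighted_probs = soft_vote(patches, weights=[0.2, 1.0, 1.0])
```

With uniform weights the single confident patch dominates the average, while down-weighting it flips the slide-level label. This is the kind of behavior a prior-guided weighting is meant to control.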
