Towards Accurate and Interpretable Neuroblastoma Diagnosis via Contrastive Multi-scale Pathological Image Analysis

Zhu Zhu, Shuo Jiang, Jingyuan Zheng, Yawen Li, Yifei Chen, Manli Zhao, Weizhong Gu, Feiwei Qin, Jinhu Wang, Gang Yu

TL;DR

Neuroblastoma (NB) subtyping from H&E-stained whole-slide images (WSIs) is hampered by subjective manual review and by automated models that are computationally heavy and hard to interpret. The authors propose CMSwinKAN, a lightweight, interpretable architecture that combines Swin KANsformer blocks, a contrast-driven multi-scale fusion module, and a KAN-based classification head, with a tissue-aware soft voting scheme for WSI-level diagnosis. Key contributions include a CMSA module for dynamic multi-scale fusion, a spline-based KAN head for improved nonlinear modeling, and an SVM-guided soft voting framework that uses clinical priors to map patch-level predictions to slide-level labels. In experiments on the private PpNTs dataset and the public BreakHis dataset, CMSwinKAN outperforms and generalizes better than the state-of-the-art pathology models it is compared against, supporting the method’s clinical relevance and offering a path toward robust AI-assisted NB diagnosis.

Abstract

Neuroblastoma, an adrenal-derived tumor, is among the most common pediatric solid malignancies and is characterized by significant clinical heterogeneity. Timely and accurate pathological diagnosis from hematoxylin and eosin-stained whole-slide images is critical for patient prognosis. However, current diagnostic practice relies primarily on subjective manual examination by pathologists, leading to inconsistent accuracy. Existing automated whole-slide image classification methods suffer from poor interpretability, limited feature extraction capabilities, and high computational costs, which restrict their practical clinical deployment. To overcome these limitations, we propose CMSwinKAN, a contrastive-learning-based multi-scale feature fusion model tailored for pathological image classification. It enhances the Swin Transformer architecture by integrating a Kernel Activation Network within its multilayer perceptron and classification head modules, significantly improving both interpretability and accuracy. By fusing multi-scale features and leveraging contrastive learning strategies, CMSwinKAN mimics clinicians' comprehensive approach, effectively capturing global and local tissue characteristics. Additionally, we introduce a heuristic soft voting mechanism, guided by clinical insights, that bridges patch-level predictions and whole-slide image-level classifications. We evaluated CMSwinKAN on the publicly available BreakHis dataset and on the PpNTs dataset, which was established by our hospital. Results demonstrate that CMSwinKAN performs better than existing state-of-the-art pathology-specific models pre-trained on large datasets. Our source code is available at https://github.com/JSLiam94/CMSwinKAN.

Paper Structure

This paper contains 25 sections, 12 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Overall architecture of the CMSwinKAN model. The model consists mainly of the Swin KANsformer block and the contrast-driven Multi-Scale Feature Fusion Module. Within the Swin KANsformer block, W-MSA and SW-MSA are multi-head self-attention modules with regular and shifted window configurations, respectively.
  • Figure 2: The architecture of the proposed WSI voting mechanism. After feature classification of the slices using an SVM classifier, weights are dynamically assigned based on classification confidence, ultimately implementing a weighted soft voting strategy for decision.
  • Figure 3: Medical data processing workflow based on pathological sections, which includes four main stages: (1) Data Collection, covering the collection of clinical patient tissue samples, tissue sample processing, and staining, as well as WSI scanning; (2) WSI preprocessing, involving physician evaluation and screening of WSIs and data organization; (3) Patch-level Data Construction, including determining patch size and sliding window segmentation of slices; (4) Dataset Preprocessing, covering invalid region filtering and dataset splitting, ultimately providing structured data support for model training and evaluation.
  • Figure 4: Visual representation of our private PpNTs dataset.
  • Figure 5: Relationship between ACC, parameters, and FLOPs of CMSwinKAN and other advanced models on our private PpNTs dataset. The size of each circle represents the number of parameters for each model.
  • ...and 3 more figures
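
The Figure 2 caption describes a soft voting step in which patch-level predictions are weighted by classification confidence before a slide-level decision is made. The Python sketch below illustrates that general idea only; it is not the authors' implementation. The function name `weighted_soft_vote`, the use of the maximum class probability as the confidence weight, and the example probabilities are all assumptions made for illustration, and the SVM stage is replaced by probabilities supplied directly.

```python
import numpy as np


def weighted_soft_vote(patch_probs):
    """Aggregate patch-level class probabilities into one slide-level label.

    patch_probs: array-like of shape (n_patches, n_classes), rows sum to 1.
    Assumption: each patch is weighted by its own confidence, taken here as
    its highest class probability. The paper may use a different rule.
    """
    probs = np.asarray(patch_probs, dtype=float)
    if probs.ndim != 2 or probs.shape[0] == 0:
        raise ValueError("expected a non-empty (n_patches, n_classes) array")

    confidence = probs.max(axis=1)            # per-patch confidence
    weights = confidence / confidence.sum()   # normalise weights to sum to 1
    slide_probs = weights @ probs             # confidence-weighted average
    return int(slide_probs.argmax()), slide_probs


# Hypothetical example: one confident patch for class 0 and two
# low-confidence patches leaning towards class 1.
patches = [[0.90, 0.10],
           [0.40, 0.60],
           [0.45, 0.55]]

label, slide_probs = weighted_soft_vote(patches)
hard_vote = int(np.bincount(np.argmax(patches, axis=1)).argmax())
print("soft vote:", label, "hard vote:", hard_vote)
```

In this toy input a plain majority vote picks class 1 (two patches out of three), while the confidence-weighted vote picks class 0 because the single confident patch carries more weight. That difference is the motivation for weighting by confidence rather than counting patches equally.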