Cross-Species Data Integration for Enhanced Layer Segmentation in Kidney Pathology

Junchao Zhu, Mengmeng Yin, Ruining Deng, Yitian Long, Yu Wang, Yaohong Wang, Shilin Zhao, Haichun Yang, Yuankai Huo

TL;DR

This work tackles the challenge of limited annotated human kidney histopathology data for cortex–medulla layer segmentation by leveraging cross-species homologous data from mouse kidneys. It introduces a Cross-species Training Framework that jointly trains on human and mouse PAS-stained images, employing a hybrid loss $L_s$ for the independent tasks and a joint loss $L_t$, based on a weighted Focal loss, to handle class imbalance, across CNN and Transformer architectures. Empirical results show consistent improvements in both mIoU and Dice scores for the human cortex and medulla, along with enhanced model generalization when cross-species data are used. The findings suggest that low-noise cross-species data can augment learning when clinical samples are limited, with practical implications for kidney pathology analysis and broader cross-species data integration in medical image segmentation. Code is publicly available at https://github.com/hrlblab/layer_segmentation.
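To make the class-imbalance idea concrete, here is a minimal sketch of a class-weighted Focal loss in plain Python. It uses the standard Focal loss form, $-\alpha_t (1 - p_t)^{\gamma} \log p_t$, and is not necessarily the exact $L_t$ used in the paper. The function name, the `class_weights` values, and the default `gamma` are assumptions chosen for illustration.

```python
import math


def weighted_focal_loss(probs, targets, class_weights, gamma=2.0, eps=1e-7):
    """Mean class-weighted focal loss over a flat list of pixels.

    probs:         per-pixel lists of class probabilities (each sums to 1)
    targets:       per-pixel integer class labels
    class_weights: hypothetical per-class weights (alpha_t); larger values
                   emphasize rarer classes
    gamma:         focusing parameter; gamma=0 gives weighted cross-entropy
    """
    total = 0.0
    for p, t in zip(probs, targets):
        # Probability assigned to the true class, clamped for a stable log.
        p_t = min(max(p[t], eps), 1.0 - eps)
        # (1 - p_t)^gamma shrinks the contribution of well-classified pixels.
        total += -class_weights[t] * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(targets)


# Illustrative call: two pixels, two classes, class 1 weighted twice as much.
loss = weighted_focal_loss(
    probs=[[0.9, 0.1], [0.4, 0.6]],
    targets=[0, 1],
    class_weights=[1.0, 2.0],
)
```

With `gamma=0` the expression reduces to weighted cross-entropy, which makes the focusing term easy to isolate when comparing settings.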

Abstract

Accurate delineation of the boundaries between the renal cortex and medulla is crucial for subsequent functional and structural analysis and for disease diagnosis. Training high-quality deep-learning models for layer segmentation relies on the availability of large amounts of annotated data. However, because of patient privacy constraints on medical data and the scarcity of clinical cases, constructing pathological datasets from clinical sources is relatively difficult and expensive. Moreover, using external natural image datasets introduces noise during the domain generalization process. Cross-species homologous data, such as mouse kidney data, which exhibit high structural and feature similarity to human kidneys, have the potential to enhance model performance on human datasets. In this study, we incorporated a privately collected Periodic Acid-Schiff (PAS) stained mouse kidney dataset into the human kidney dataset for joint training. The results showed that, after introducing cross-species homologous data, the semantic segmentation models based on CNN and Transformer architectures achieved average increases of 1.77% and 1.24% in mIoU, and 1.76% and 0.89% in Dice score, for the human renal cortex and medulla datasets, respectively. This approach can also enhance the model's generalization ability. These results indicate that cross-species homologous data, as a low-noise trainable data source, can help improve model performance when clinical samples are limited. Code is available at https://github.com/hrlblab/layer_segmentation.
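For readers unfamiliar with the two reported metrics, the sketch below computes per-class IoU and Dice from flattened label maps and averages them over classes. It follows the standard definitions, IoU = TP / (TP + FP + FN) and Dice = 2TP / (2TP + FP + FN). The function names and the label IDs (0 for background, 1 for cortex, 2 for medulla) are assumptions for illustration and are not taken from the paper's code.

```python
def class_iou_dice(pred, target, cls):
    """Return (IoU, Dice) for one class, or None if the class never occurs."""
    tp = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    fp = sum(1 for p, t in zip(pred, target) if p == cls and t != cls)
    fn = sum(1 for p, t in zip(pred, target) if p != cls and t == cls)
    if tp + fp + fn == 0:
        return None  # class absent from both prediction and ground truth
    iou = tp / (tp + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn)
    return iou, dice


def mean_iou_dice(pred, target, classes):
    """Average IoU and Dice over the classes that actually occur."""
    scores = [class_iou_dice(pred, target, c) for c in classes]
    scores = [s for s in scores if s is not None]
    if not scores:
        raise ValueError("none of the classes occur in pred or target")
    miou = sum(s[0] for s in scores) / len(scores)
    mdice = sum(s[1] for s in scores) / len(scores)
    return miou, mdice


# Toy flattened label maps with hypothetical IDs:
# 0 = background, 1 = cortex, 2 = medulla.
pred = [0, 1, 1, 2, 2, 2]
target = [0, 1, 2, 2, 2, 1]
miou, mdice = mean_iou_dice(pred, target, classes=[0, 1, 2])
```

Because Dice counts true positives twice, it is always at least as large as IoU for the same class, so gains in the two metrics are not directly comparable in magnitude.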

Paper Structure (11 sections, 2 equations, 4 figures, 1 table)

Figures (4)

  • Figure 1: The feature distribution of the cortex and medulla in human and mouse kidney images at the patch level. Human and mouse data points have a high overlap in the PCA space, indicating a high degree of cross-species similarity in the homologous tissue structure.
  • Figure 2: Framework of the Cross-Species Training Process: Patches are first extracted from WSIs and then used to train and test several baseline models, with each species' data processed individually as well as in combination.
  • Figure 3: The distribution of mIoU and Dice scores across test datasets under different training schemes. The collaborative training approach led to a more concentrated distribution of these metrics.
  • Figure 4: The qualitative outcomes of models on various datasets. By utilizing external homologous data, the models perceive edge textures better and therefore localize and identify kidney layer boundaries more precisely.