Large-scale cervical precancerous screening via AI-assisted cytology whole slide image analysis

Honglin Li, Yusuan Sun, Chenglu Zhu, Yunlong Zhang, Shichuan Zhang, Zhongyi Shui, Pingyi Chen, Jingxiong Li, Sunyi Zheng, Can Cui, Lin Yang

TL;DR

STRIDE addresses the bottleneck of limited annotations by integrating patient-level labels with a small portion of cell-level labels through an end-to-end training strategy, which enables scalable learning across extensive datasets. It also generates explanatory textual descriptions that simulate pathologists' diagnostic processes by aligning cell image features with textual descriptions.

Abstract

Cervical cancer continues to be the leading gynecological malignancy, posing a persistent threat to women's health on a global scale. Early screening via cytology Whole Slide Image (WSI) diagnosis is critical to prevent the progression of this cancer and to improve survival rates, but a single review by a pathologist inevitably suffers from false negatives because of the immense number of cells that must be examined within a WSI. Although computer-aided automated diagnostic models can serve as a strong complement to pathologists, their effectiveness is hampered by the paucity of extensive and detailed annotations, coupled with limited interpretability and robustness. These factors significantly hinder their practical applicability and reliability in clinical settings. To tackle these challenges, we develop an AI approach for cervical cytology: a Scalable Technology for Robust and Interpretable Diagnosis built on Extensive data (STRIDE). STRIDE addresses the bottleneck of limited annotations by integrating patient-level labels with a small portion of cell-level labels through an end-to-end training strategy, which enables scalable learning across extensive datasets. To further improve robustness to the real-world domain shifts of cytology slide-making and imaging, STRIDE trains on color adversarial samples that mimic staining and imaging variations. Lastly, to achieve pathologist-level interpretability and thus trustworthiness in clinical settings, STRIDE can generate explanatory textual descriptions that simulate pathologists' diagnostic processes by aligning cell image features with textual descriptions. In extensive experiments and evaluations across 183 medical centers, on a dataset of 341,889 WSIs and 0.1 billion cells from cervical cytology patients, STRIDE demonstrates remarkable superiority over previous state-of-the-art techniques.
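
As a rough illustration of how a patient-level (slide-level) label can supervise cell-level scores, the sketch below pools the top-K cell scores of one WSI into a slide probability and scores it against the slide label. The mean-pooling head, the function names (`top_k_indices`, `wsi_probability`, `slide_bce`), and the example scores are assumptions made for illustration; the abstract does not specify STRIDE's actual aggregation head.

```python
import math
from typing import List


def top_k_indices(scores: List[float], k: int) -> List[int]:
    """Indices of the k highest-scoring cell patches, highest score first."""
    if k < 1:
        raise ValueError("k must be at least 1")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def wsi_probability(scores: List[float], k: int) -> float:
    """Mean-pool the top-k cell scores into one slide-level score.

    Mean pooling is an assumed stand-in for a learned WSI head.
    """
    if not scores:
        raise ValueError("a slide needs at least one candidate cell")
    chosen = top_k_indices(scores, k)
    return sum(scores[i] for i in chosen) / len(chosen)


def slide_bce(prob: float, label: int) -> float:
    """Binary cross-entropy of the slide probability against the slide label."""
    p = min(max(prob, 1e-7), 1.0 - 1e-7)  # clip to keep log() finite
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Hypothetical lesion scores for four detected cells on one slide.
cell_scores = [0.1, 0.9, 0.4, 0.8]
slide_prob = wsi_probability(cell_scores, k=2)  # pools cells 1 and 3
loss_if_positive = slide_bce(slide_prob, label=1)
```

Only the slide label enters the loss here, so no cell on this slide needs its own annotation; in an end-to-end setup the gradient of that loss would flow back into the cell scorer.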

Paper Structure

This paper contains 1 section, 8 equations, 12 figures, 2 tables, 2 algorithms.

Table of Contents

  1. References

Figures (12)

  • Figure 1: Overview of the model development and evaluation. a. Model development. STRIDE takes a digital slide of cervical cytology as input and outputs the WSI-level diagnosis probability and the detected lesion cells with corresponding textual descriptions; STRIDE was trained with confirmed patient-level WSI labels and cell-level box, category, and textual labels. b. Data collection and model evaluation. We evaluate the performance of STRIDE on internal test data, on unseen test centers with staining variation, on a real-world multicenter external test after model deployment, and in a clinical trial assisting pathologists' diagnosis.
  • Figure 2: Overview of STRIDE. a. Top-K cell distillation of a WSI. The foreground regions of the WSI are split into patches, and lesion-positive cell object detection is then performed to select the top-K representative cell patches. b. The model structure and SWIFT training scheme. We first train the cell classifier with all annotated cells, then fine-tune the cell representation backbone and the WSI classification head end-to-end using semi-weakly supervised integrated fine-tuning, where the labelled cells form the supervised stream and the unlabelled top-K cells of a WSI form the weakly supervised stream. The supervised stream is trained with strong augmentation in the traditional way, while the weakly supervised stream is trained at the same time via two components: 1) WSI-level end-to-end MIL training; 2) cell-level semi-weakly supervised training. c. The ColorAdv steps: color adversarial example generation and generalized model learning, which are optimized alternately. The first step identifies color adversarial examples by maximizing the loss, while the second step updates the model using a mixture of the generated color adversarial examples and the original examples. d. Alignment of cell image features and text description features. The visual cell classification model is frozen, since it already captures morphological features from the earlier training steps. A language model that encodes the interpretable text into feature vectors is fine-tuned to align the text features with the visual features.
  • Figure 3: WSI diagnosis results with different WSI-head architectures and cell-level features. a. t-SNE decomposition of WSI features and AUC details for different WSI heads with different cell-level embeddings (pre-trained with cell-level label supervision and fine-tuned with WSI-level labels). b. The corresponding binary classification ROC curves. c. WSI confusion matrix comparison between the embedding pre-trained on cell-level annotations (left) and the embedding fine-tuned at the WSI level (right), given similar specificity (0.65).
  • Figure 4: WSI diagnosis performance and robustness on internal test data. a. Semi-/self-supervised learning features on the given cell annotations and the unlabelled top-K cells of WSIs. b. WSI top-K fine-tuning (WSI-FT, or SWIFT) features with/without cell-level augmentations and different fine-tuning tricks. c. Staining color changes among different data centers. d. The first column shows original images and the remaining two columns show color adversarial samples in RGB and HSV space. e. The corresponding staining distributions. f. Performance under domain shift: AUC (left) and specificity (right).
  • Figure 5: WSI diagnosis performance on external test and clinical trial data. a. Our model reaches high sensitivity and specificity in 105 unseen data centers with a total of n = 233,509. b. The corresponding confusion matrix of (a). c. Significant improvement in the clinical trial: assisted by the AI, pathologists improve both sensitivity and specificity, and thus reach higher consistency (Cohen's Kappa = 0.7197) with the gold standard (annotated by a panel of more expert pathologists). d. The binary confusion matrix of our AI model and of its assistance to pathologists.
  • ...and 7 more figures
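
The two alternating ColorAdv steps described in the Figure 2c caption (find a color adversarial example by maximizing the loss, then update the model on a mixture of original and adversarial examples) can be sketched on a toy problem. Everything concrete below is an assumption for illustration rather than the paper's implementation: each cell patch is reduced to its mean RGB color, the model is a logistic regression, the color change is an additive per-channel shift bounded by `eps`, and the loss maximization is a single sign-gradient step. Figure 4d indicates the paper perturbs real images in RGB and HSV space.

```python
import math
import random
from typing import List, Tuple

Color = List[float]  # mean RGB of a cell patch, each channel in [0, 1]


def predict(w: List[float], b: float, x: Color) -> float:
    """Logistic model: probability that the patch is lesion-positive."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(w: List[float], b: float, x: Color, y: int) -> float:
    """Binary cross-entropy of the prediction against label y."""
    p = min(max(predict(w, b, x), 1e-7), 1.0 - 1e-7)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def color_adversary(w: List[float], b: float, x: Color, y: int, eps: float) -> Color:
    """Step 1: shift each channel by at most eps in the loss-increasing direction."""
    p = predict(w, b, x)
    grad_x = [(p - y) * wi for wi in w]  # d(loss)/d(x) for logistic regression
    sign = lambda g: (g > 0) - (g < 0)
    return [min(max(xi + eps * sign(g), 0.0), 1.0) for xi, g in zip(x, grad_x)]


def sgd_step(w: List[float], b: float, x: Color, y: int, lr: float) -> Tuple[List[float], float]:
    """One gradient-descent update of the model on a single example."""
    p = predict(w, b, x)
    return [wi - lr * (p - y) * xi for wi, xi in zip(w, x)], b - lr * (p - y)


def train(data: List[Tuple[Color, int]], eps: float = 0.1, lr: float = 0.5,
          epochs: int = 200, seed: int = 0) -> Tuple[List[float], float]:
    """Alternate adversary generation and model updates on a clean/adversarial mix."""
    rng = random.Random(seed)
    samples = list(data)  # shuffle a copy so the caller's list is untouched
    w, b = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            x_adv = color_adversary(w, b, x, y, eps)  # step 1: maximize the loss
            for sample in (x, x_adv):                 # step 2: learn from the mixture
                w, b = sgd_step(w, b, sample, y, lr)
    return w, b
```

For a linear model the sign step is the exact worst-case shift inside the `eps` box (before clipping to the valid color range); a deep network would typically need several ascent steps instead.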