Generalizable Cervical Cancer Screening via Large-scale Pretraining and Test-Time Adaptation

Hao Jiang, Cheng Jin, Huangjing Lin, Yanning Zhou, Xi Wang, Jiabo Ma, Li Ding, Jun Hou, Runsheng Liu, Zhizhong Chai, Luyang Luo, Huijuan Shi, Yinling Qian, Qiong Wang, Changzhong Li, Anjia Han, Ronald Cheong Kin Chan, Hao Chen

TL;DR

Smart-CCS is a generalizable cervical cancer screening paradigm that uses large-scale pretraining and test-time adaptation to build robust screening systems. The system demonstrated superior sensitivity in diagnosing cervical cancer, and its screening results were validated against histology findings.

Abstract

Cervical cancer is a leading malignancy of the female reproductive system. While AI-assisted cytology offers a cost-effective and non-invasive screening solution, current systems struggle with generalizability in complex clinical scenarios. To address this issue, we introduced Smart-CCS, a generalizable Cervical Cancer Screening paradigm based on pretraining and adaptation to create robust and generalizable screening systems. To develop and validate Smart-CCS, we first curated a large-scale, multi-center dataset named CCS-127K, which comprises a total of 127,471 cervical cytology whole-slide images collected from 48 medical centers. Through large-scale self-supervised pretraining, our CCS models gain strong generalization capability, potentially generalizing across diverse scenarios. Then, we incorporated test-time adaptation to optimize the trained CCS model for complex clinical settings: it adapts and refines predictions, improving real-world applicability. We conducted a large-scale system evaluation across various cohorts. In retrospective cohorts, Smart-CCS achieved an overall area under the curve (AUC) of 0.965 and a sensitivity of 0.913 for cancer screening on 11 internal test datasets. In external testing, system performance remained high, with an AUC of 0.950 across 6 independent test datasets. In prospective cohorts, Smart-CCS achieved AUCs of 0.947, 0.924, and 0.986 in three prospective centers, respectively. Moreover, the system demonstrated superior sensitivity in diagnosing cervical cancer, and histology findings were used to validate the accuracy of our cancer screening results. Interpretability analysis with cell and slide predictions further indicated that the system's decision-making aligns with clinical practice. Smart-CCS represents a significant advancement in cancer screening across diverse clinical contexts.
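The abstract does not say how the test-time adaptation step works internally. As a rough illustration of the general idea only, the sketch below swaps in one common and simple form of test-time adaptation: re-estimating feature normalization statistics on the unlabeled test batch from a new center. The feature values, the labels, the `SOURCE_MU` and `SOURCE_SIGMA` statistics, and the threshold classifier are all made up for this example and are not taken from Smart-CCS.

```python
from statistics import mean, pstdev

# Hypothetical normalization statistics estimated on the training centers.
SOURCE_MU, SOURCE_SIGMA = 0.0, 1.0


def standardize(values, mu, sigma):
    """Map raw feature values to z-scores using the given statistics."""
    return [(v - mu) / sigma for v in values]


def predict(z_scores, threshold=0.0):
    """Toy frozen classifier: flag a slide as abnormal if its z-score exceeds the threshold."""
    return [1 if z > threshold else 0 for z in z_scores]


# Unlabeled slides from a new center whose features are shifted and rescaled
# (for example by different staining or scanners). Labels are used only to check the result.
test_features = [-1.0, 1.0, 3.0, 5.0]
test_labels = [0, 0, 1, 1]

# Without adaptation: reuse the source statistics on the shifted data.
frozen_preds = predict(standardize(test_features, SOURCE_MU, SOURCE_SIGMA))

# With adaptation: re-estimate the statistics from the test batch itself (no labels needed).
adapted_preds = predict(
    standardize(test_features, mean(test_features), pstdev(test_features))
)

print("frozen :", frozen_preds)
print("adapted:", adapted_preds)
```

In this toy setting the frozen model misclassifies one normal slide because of the shift, while the adapted normalization recovers the intended decision boundary. Methods used in practice typically also update model parameters or refine predictions at test time.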

Paper Structure

This paper contains 29 sections, 3 equations, 11 figures, 16 tables.

Figures (11)

  • Figure 1: Overview of CCS-127K data and annotations in this study. a. Class-wise distribution of 127,471 WSIs collected from 48 medical centers, including 124,118 samples from 45 retrospective centers and 3,353 samples from 3 prospective centers. Note: RCM represents the centers merged due to limited sample size for each center. b. Abnormal cell annotation statistics: 104,979 abnormal lesion cells were annotated into 6 categories (ASC-US, LSIL, ASC-H, HSIL, SCC, and AGC), termed the CCS-Cell dataset. c. Flowchart of the study for the development and validation of the proposed Smart-CCS system. The orange flow represents retrospective studies, while the blue flow represents prospective studies.
  • Figure 1: Overview of cervical cancer screening and computational cytology. a. Illustration of cervical cells infected with human papillomavirus (HPV), leading to cervical cancer. Cytology sample collection involves sampling, centrifugation, staining, and imaging, followed by cytologist examination and a screening report. b. Key challenges in CCS include cytomorphological similarity, sparse abnormal cell distribution, identifying abnormal cells in gigapixel-sized whole slide images (WSIs), and data variability. c. A general AI-assisted cancer screening pipeline, comprising a cell detector and a slide classifier, provides quantitative and visualized predictions for both cell-level and slide-level screening.
  • Figure 2: Overview of the Smart-CCS paradigm. The Smart-CCS paradigm consists of three sequential stages. a. The pretraining stage, which involves large-scale self-supervised pretraining on diverse cytology images from various centers to build a generalizable feature extraction model. b. The finetuning stage, which specializes the pretrained model for cancer screening tasks and includes two components: an abnormal cell detector for identifying abnormal cells and a WSI classifier for slide-level predictions. c. The adaptation stage, which further optimizes the trained model for diverse clinical settings by adapting and refining predictions.
  • Figure 2: Conceptual illustration of the proposed Smart-CCS paradigm. It consists of three sequential stages: 1) large-scale self-supervised pretraining, 2) CCS model finetuning, and 3) test-time adaptation.
  • Figure 3: Performance of Smart-CCS in the retrospective study. a. Evaluation of the cell-level cytology task using cell classification datasets: SIPaKMeD ($N$ = 4,049), Herlev ($N$ = 918), and CCS-Cell ($N$ = 9,008). b. Evaluation of the WSI-level cytology task using retrospective cervical cytology datasets ($N$ = 5,189, 11,986, 8,396) to assess cancer screening (ECA) and fine-grained classification (ALL) performance. c. Comparison of abnormal cell detection performance among DDETR, DETR, RetinaNet, Faster R-CNN, and YOLOv3 on the CCS-Cell dataset. d. External testing performance, measured by AUC under different settings: Base denotes the typical two-step CCS model, w/ P adds pretraining, and w/ P&A refers to our proposed Smart-CCS with pretraining and adaptation. e. Internal and external data distribution, along with the results of cervical cancer screening evaluations.
  • ...and 6 more figures
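
The screening results above are reported as AUC and sensitivity. As a reminder of what these two metrics measure, here is a minimal pure-Python sketch. The labels and scores are invented for illustration and are not from CCS-127K, and the 0.5 decision threshold is an arbitrary assumption.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity(labels, scores, threshold):
    """Fraction of true positives whose score reaches the decision threshold."""
    pos_scores = [s for y, s in zip(labels, scores) if y == 1]
    if not pos_scores:
        raise ValueError("sensitivity needs at least one positive sample")
    detected = sum(1 for s in pos_scores if s >= threshold)
    return detected / len(pos_scores)


# Toy slide-level example: 1 = abnormal slide, 0 = normal slide.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]

toy_auc = auc(toy_labels, toy_scores)
toy_sens = sensitivity(toy_labels, toy_scores, threshold=0.5)
print(f"AUC = {toy_auc:.3f}, sensitivity at 0.5 = {toy_sens:.3f}")
```

AUC is independent of any threshold, whereas sensitivity depends on the operating point chosen. This is why a screening system can report a high AUC alongside a separate sensitivity figure.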