Towards a Comprehensive Benchmark for Pathological Lymph Node Metastasis in Breast Cancer Sections

Xitong Ling, Yuanyuan Lei, Jiawen Li, Junru Cheng, Wenting Huang, Tian Guan, Jian Guan, Yonghong He

TL;DR

This study reprocessed 1,399 whole slide images (WSIs) and their labels from the Camelyon-16 and Camelyon-17 datasets by removing low-quality slides, correcting erroneous labels, and adding expert pixel-level annotations for tumor regions in the previously unreleased test set. The result is a cleaned benchmark intended to advance AI development in histopathology.

Abstract

Advances in optical microscopy scanning have significantly contributed to computational pathology (CPath) by converting traditional histopathological slides into whole slide images (WSIs). This development enables comprehensive digital reviews by pathologists and accelerates AI-driven diagnostic support for WSI analysis. Recent advances in foundational pathology models have increased the need for benchmarking tasks. The Camelyon series is one of the most widely used open-source datasets in computational pathology. However, the quality, accessibility, and clinical relevance of the labels have not been comprehensively evaluated. In this study, we reprocessed 1,399 WSIs and labels from the Camelyon-16 and Camelyon-17 datasets, removing low-quality slides, correcting erroneous labels, and providing expert pixel annotations for tumor regions in the previously unreleased test set. Based on the sizes of re-annotated tumor regions, we upgraded the binary cancer screening task to a four-class task: negative, micro-metastasis, macro-metastasis, and Isolated Tumor Cells (ITC). We reevaluated pre-trained pathology feature extractors and multiple instance learning (MIL) methods using the cleaned dataset, providing a benchmark that advances AI development in histopathology.
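The four-class labeling described above can be illustrated with a minimal sketch. It assumes the size thresholds commonly used in breast cancer nodal staging and in the original Camelyon-17 challenge (ITC up to 0.2 mm, micro-metastasis above 0.2 mm and up to 2 mm, macro-metastasis above 2 mm); the abstract does not state the exact cut-offs the authors applied, and the cell-count criterion sometimes used for ITC is ignored here. The names `SlideLabel` and `label_from_lesions` are hypothetical.

```python
from enum import Enum
from typing import Sequence


class SlideLabel(Enum):
    NEGATIVE = "negative"
    ITC = "isolated tumor cells"
    MICRO = "micro-metastasis"
    MACRO = "macro-metastasis"


# Assumed thresholds (largest lesion diameter, in millimetres).
# These follow the common staging convention, not values taken from the paper.
ITC_MAX_MM = 0.2
MICRO_MAX_MM = 2.0


def label_from_lesions(lesion_diameters_mm: Sequence[float]) -> SlideLabel:
    """Assign a slide-level class from the annotated tumor regions.

    The slide takes the class of its largest lesion; a slide with no
    annotated tumor region is negative.
    """
    if not lesion_diameters_mm:
        return SlideLabel.NEGATIVE
    largest = max(lesion_diameters_mm)
    if largest > MICRO_MAX_MM:
        return SlideLabel.MACRO
    if largest > ITC_MAX_MM:
        return SlideLabel.MICRO
    return SlideLabel.ITC
```

For example, a slide with annotated lesions of 0.1 mm and 1.4 mm would be labeled micro-metastasis under these assumed thresholds, because the largest lesion determines the class.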

Paper Structure

This paper contains 17 sections, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Issues within the Camelyon-16 [bejnordi2017diagnostic] and Camelyon-17 [bandi2018detection] datasets: therapeutic response, annotation omissions, blurred boundaries, and poor staining.
  • Figure 2: Revised data distribution of the Camelyon dataset and the pathological characteristics of different categories.
  • Figure 3: AUC and F1-score Comparison of Different Methods in the Camelyon-17-Origin and Camelyon-17-Refine Comparative Experiments.
  • Figure 4: Model ranking radar chart based on AUC and F1-score on Camelyon-17-Origin and Camelyon-17-Refine datasets.
  • Figure 5: Distribution of AUC and F1-score in benchmark results across different feature extractors and aggregators.
  • ...and 1 more figures