PanTS: The Pancreatic Tumor Segmentation Dataset

Wenxuan Li, Xinze Zhou, Qi Chen, Tianyu Lin, Pedro R. A. S. Bassi, Szymon Plotka, Jaroslaw B. Cwikla, Xiaoxi Chen, Chen Ye, Zheren Zhu, Kai Ding, Heng Li, Kang Wang, Yang Yang, Yucheng Tang, Daguang Xu, Alan L. Yuille, Zongwei Zhou

TL;DR

PanTS delivers the largest multi-institutional CT dataset for pancreatic tumor analysis, featuring 36,390 scans from 145 centers with voxel-wise annotations for tumors, pancreas subregions, and 24 surrounding structures. The work demonstrates that both dataset scale and rich anatomical context substantially improve AI performance, particularly under out-of-distribution conditions, and provides a public baseline model and benchmarking protocol. Through rigorous annotation standards and quality control, PanTS enables robust evaluation of anatomy-aware segmentation methods for detection, localization, and surgical planning. This resource has the potential to accelerate clinically translatable AI tools for early pancreatic cancer detection and radiotherapy planning.

Abstract

PanTS is a large-scale, multi-institutional dataset curated to advance research in pancreatic CT analysis. It contains 36,390 CT scans from 145 medical centers, with expert-validated, voxel-wise annotations of over 993,000 anatomical structures, covering pancreatic tumors, pancreas head, body, and tail, and 24 surrounding anatomical structures such as vascular/skeletal structures and abdominal/thoracic organs. Each scan includes metadata such as patient age, sex, diagnosis, contrast phase, in-plane spacing, and slice thickness. AI models trained on PanTS achieve significantly better performance in pancreatic tumor detection, localization, and segmentation than those trained on existing public datasets. Our analysis indicates that these gains are directly attributable to the 16x larger-scale tumor annotations and indirectly supported by the 24 additional surrounding anatomical structures. As the largest and most comprehensive resource of its kind, PanTS offers a new benchmark for developing and evaluating AI models in pancreatic CT analysis.

Paper Structure

This paper contains 34 sections, 3 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Dataset characteristics and visualization. A. PanTS comprises 36,390 CT scans collected from 145 medical centers, paired with expert-validated voxel-wise annotations, 16$\times$ larger than the largest public dataset to date (PANORAMA [alves2024panorama]). B--C. The dataset includes detailed annotations for pancreatic tumors, the pancreas, and its head, body, and tail, enabling spatially aware tumor localization. D. Twenty-four surrounding anatomical structures are voxel-wise annotated to provide rich spatial context, including key vessels, ducts, and organs critical for tumor detection, resectability assessment, and radiotherapy planning.
  • Figure 2: Geographic diversity of the PanTS dataset. Global distribution of contributing centers in the PanTS training set (purple circles) and test set (red outlines). Circle size is proportional to the base-10 logarithm (log$_{10}$) of the number of CT scans contributed per country. The training set is aggregated from diverse public datasets spanning multiple countries, while the much larger test set is drawn exclusively from three independent centers not seen during training: UCSF (United States, North America), PH (Poland, Europe), and PUTH (China, Asia). This split enables rigorous out-of-distribution evaluation.
  • Figure 3: Annotation standard and quality control. A--C. Voxel-wise annotations of pancreatic tumors and surrounding anatomical structures shown on axial, sagittal, and coronal planes. Radiologists provide these annotations following the standard described in the annotation standard section. D. 3D rendering on the coronal plane highlights detailed annotations of the tumor, pancreas, and key vessels, including the celiac artery (Celiac AA), superior mesenteric artery (SMA), common bile duct (CBD), and surrounding veins. E. To assess annotation quality, a subset of 300 CT scans from the PanTS training set was independently re-annotated by multiple radiologists. Inter-annotator agreement was evaluated using the Dice Similarity Coefficient (DSC).
  • Figure 4: Inter-annotator agreement on the PanTS subset. A. Distribution of DSC (%) values between two independent radiologists across 300 CT scans from the PanTS training set. Most annotations demonstrate high agreement, confirming their reliability. A minimum threshold of DSC = 20% (dashed red line) is used to flag low-agreement cases, which are reviewed by senior radiologists for further quality assurance. B. Representative examples showing the same CT scan annotated by two different radiologists. High-agreement cases appear in the left columns, while low-agreement cases, often involving small or ambiguous lesions, appear on the right.
  • Figure 5: Justification of annotating large-scale tumor datasets. A. Receiver Operating Characteristic (ROC) curves of a standard nnU-Net trained on pancreatic CT datasets of different scales, i.e., MSD-Pancreas ($n$ = 281), PANORAMA ($n$ = 2,238), and our PanTS dataset ($n$ = 9,901). Performance is measured on the PanTS test set, whose CT scans were collected from centers different from those of MSD-Pancreas, PANORAMA, and the PanTS training set (detailed in Figure 2). The larger the training set, the better the pancreatic tumor detection performance on the out-of-distribution test set. B. Bar plot comparing AI trained on PanTS with AI trained on a publicly available dataset (MSD-Pancreas). Performance is measured on the official MSD-Pancreas test set (third-party evaluation). All metrics can be found at https://decathlon-10.grand-challenge.org/evaluation/challenge/leaderboard/.
  • ...and 1 more figure
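
The inter-annotator check described in the Figure 3 and Figure 4 captions can be illustrated with a minimal sketch. It uses the standard definition of the Dice Similarity Coefficient, DSC = 2|A ∩ B| / (|A| + |B|), and the 20% review threshold stated in the Figure 4 caption. Everything else is an assumption made for illustration and is not taken from the PanTS codebase: the function names `dice_percent` and `flag_low_agreement`, the representation of each mask as a set of voxel coordinates, and the choice to score two empty masks as 100%.

```python
from typing import Dict, List, Set, Tuple

Voxel = Tuple[int, int, int]  # (z, y, x) index of a foreground voxel


def dice_percent(mask_a: Set[Voxel], mask_b: Set[Voxel]) -> float:
    """Dice Similarity Coefficient between two masks, in percent.

    Each mask is the set of voxel coordinates labeled as foreground.
    """
    total = len(mask_a) + len(mask_b)
    if total == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 100.0
    overlap = len(mask_a & mask_b)
    return 100.0 * 2 * overlap / total


def flag_low_agreement(
    pairs: Dict[str, Tuple[Set[Voxel], Set[Voxel]]],
    threshold: float = 20.0,
) -> List[str]:
    """Return IDs of cases whose DSC falls below the review threshold."""
    return [
        case_id
        for case_id, (mask_a, mask_b) in pairs.items()
        if dice_percent(mask_a, mask_b) < threshold
    ]


if __name__ == "__main__":
    # Hypothetical toy masks standing in for two radiologists' annotations.
    reader_1 = {(0, 0, 0), (0, 0, 1)}
    reader_2 = {(0, 0, 1), (0, 0, 2)}
    far_away = {(5, 5, 5)}

    cases = {
        "case_a": (reader_1, reader_2),  # partial overlap
        "case_b": (reader_1, far_away),  # no overlap
    }
    print(dice_percent(reader_1, reader_2))
    print(flag_low_agreement(cases))
```

In this toy example the second case has no overlapping voxels, so it falls below the threshold and would be routed to senior review. Real volumetric masks would more likely be stored as arrays, but the computation is the same.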