Deep Learning for Pancreas Segmentation: a Systematic Review
Andrea Moglia, Matteo Cavicchioli, Luca Mainardi, Pietro Cerveri
TL;DR
This systematic review assesses a decade of deep learning efforts in CT pancreas segmentation, detailing architectures from UNet variants to transformers and hybrid CNN–Transformer models. It synthesizes findings on parenchyma and lesion segmentation, highlighting two-stage localization strategies, multi-organ approaches, and a wide spectrum of semi-/unsupervised learning techniques, while emphasizing generalization gaps across datasets and centers. The review underscores the critical role of public datasets (e.g., NIH, MSD, AbdomenCT-1k) and standard metrics (DSC, Jaccard, HD/HD95, ASD, NSD) in benchmarking progress, yet also notes substantial challenges in clinical translation, including data diversity, reproducibility, and regulatory hurdles. Overall, while DL models have achieved high accuracy on certain benchmarks, robust generalization, real-time performance, and explainability remain essential for reliable clinical adoption in pancreas imaging and surgical planning.
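As a rough illustration of the two overlap metrics named above, the following minimal sketch computes the Dice similarity coefficient (DSC) and the Jaccard index for a pair of binary masks. It assumes the masks are flattened to equal-length sequences of 0/1 voxel labels, and it returns 1.0 when both masks are empty, which is a convention chosen here rather than one taken from the review. The function names are hypothetical.

```python
from typing import Sequence, Tuple


def overlap_counts(pred: Sequence[int], truth: Sequence[int]) -> Tuple[int, int, int]:
    """Return (intersection, predicted foreground, true foreground) voxel counts."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    n_pred = sum(1 for p in pred if p)
    n_truth = sum(1 for t in truth if t)
    return inter, n_pred, n_truth


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|)."""
    inter, n_pred, n_truth = overlap_counts(pred, truth)
    denom = n_pred + n_truth
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2.0 * inter / denom


def jaccard(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Jaccard = |A ∩ B| / |A ∪ B|."""
    inter, n_pred, n_truth = overlap_counts(pred, truth)
    union = n_pred + n_truth - inter
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


# Toy example: 6 voxels, 2 of which are foreground in both masks.
pred_mask = [0, 1, 1, 1, 0, 0]
true_mask = [0, 0, 1, 1, 1, 0]
dsc = dice(pred_mask, true_mask)
jac = jaccard(pred_mask, true_mask)
```

The two scores are monotonically related (Jaccard = DSC / (2 - DSC)), so they rank models identically on a single case. Distance-based metrics such as HD95 and ASD capture boundary errors that overlap scores can miss.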
Abstract
Pancreas segmentation has traditionally been challenging because of the organ's small size in abdominal computed tomography (CT) volumes, the high variability of its shape and position among patients, and blurred boundaries caused by low contrast between the pancreas and surrounding organs. Many deep learning models for pancreas segmentation have been proposed in the past few years. We present a systematic review based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The literature search was conducted on PubMed, Web of Science, Scopus, and IEEE Xplore for original studies published in peer-reviewed journals from 2013 to 2023. Overall, 130 studies were retrieved. We first provide an overview of the technical background of the most common network architectures and of the publicly available datasets. We then analyze the studies through tables and accompanying text. The tables group the studies and specify, for each, the application, dataset size, design (model architecture, learning strategy, and loss function), results, and main contributions. We first analyze the studies on parenchyma segmentation that use coarse-to-fine approaches, multi-organ segmentation, semi-supervised learning, and unsupervised learning, followed by studies on generalization to other datasets and on the design of new loss functions. We then analyze the studies on the segmentation of tumors, cysts, and inflammation, covering multi-stage methods, semi-supervised learning, generalization to other datasets, and the design of new loss functions. Finally, we provide a critical discussion based on the published evidence, underlining the issues that need to be addressed before clinical translation.
