Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges

Debesh Jha; Vanshali Sharma; Debapriya Banik; Debayan Bhattacharya; Kaushiki Roy; Steven A. Hicks; Nikhil Kumar Tomar; Vajira Thambawita; Adrian Krenzer; Ge-Peng Ji; Sahadev Poudel; George Batchkala; Saruar Alam; Awadelrahman M. A. Ahmed; Quoc-Huy Trinh; Zeshan Khan; Tien-Phat Nguyen; Shruti Shrestha; Sabari Nathan; Jeonghwan Gwak; Ritika K. Jha; Zheyuan Zhang; Alexander Schlaefer; Debotosh Bhattacharjee; M. K. Bhuyan; Pradip K. Das; Deng-Ping Fan; Sravanthi Parsa; Sharib Ali; Michael A. Riegler; Pål Halvorsen; Thomas De Lange; Ulas Bagci

TL;DR

This work analyzes two major competitions, Medico 2020 and MedAI 2021, that benchmark automatic polyp and instrument segmentation in colonoscopy and bring transparency in medical image analysis to the foreground. It documents datasets derived from publicly available sources (e.g., Kvasir-SEG, Kvasir-Instrument), the evaluation metrics (mean Intersection over Union, $mIoU$; Dice similarity coefficient, $DSC$; frames per second, $FPS$) and task-specific criteria, and compares leading architectures (encoder–decoder U-Nets, PraNet, EfficientUNet, DPRA-EdgeNet) and ensemble strategies. A key contribution is the explicit focus on transparency: interdisciplinary experts reviewed open-source practices, failure analyses, and model explanations (heatmaps, layer-wise relevance propagation (LRP)) to assess model credibility for clinical deployment. The analysis identifies robust performance gains in MedAI 2021, notes persistent challenges with diminutive or flat polyps and occlusions, and advocates multi-center, video-sequence benchmarks and a stronger emphasis on explainability to bridge the gap to real-world adoption.
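As a rough illustration of the three metrics named above, the following sketch computes IoU and DSC for binary masks and averages them over a set of images, plus a simple frames-per-second figure. It is a minimal sketch under stated assumptions, not the challenges' official evaluation code: the function names are hypothetical, masks are assumed to be nested lists of 0/1 values, scores are assumed to be averaged per image, and two empty masks are assumed to count as perfect agreement.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, DSC) for two equally sized binary masks (nested lists of 0/1)."""
    flat_pred = [p for row in pred for p in row]
    flat_truth = [t for row in truth for t in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must contain the same number of pixels")

    # Count true positives, false positives, and false negatives pixel by pixel.
    tp = sum(1 for p, t in zip(flat_pred, flat_truth) if p and t)
    fp = sum(1 for p, t in zip(flat_pred, flat_truth) if p and not t)
    fn = sum(1 for p, t in zip(flat_pred, flat_truth) if not p and t)

    if tp + fp + fn == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0, 1.0

    iou = tp / (tp + fp + fn)            # |A ∩ B| / |A ∪ B|
    dice = 2 * tp / (2 * tp + fp + fn)   # 2|A ∩ B| / (|A| + |B|)
    return iou, dice


def mean_scores(pairs):
    """Average IoU and DSC over (prediction, ground truth) pairs, one pair per image."""
    scores = [iou_and_dice(pred, truth) for pred, truth in pairs]
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n


def frames_per_second(num_frames, elapsed_seconds):
    """Throughput: number of frames processed divided by wall-clock seconds."""
    return num_frames / elapsed_seconds


# Usage: one predicted pixel is correct, one is a false positive.
pred = [[1, 1], [0, 0]]
truth = [[1, 0], [0, 0]]
iou, dice = iou_and_dice(pred, truth)
m_iou, m_dice = mean_scores([(pred, truth), (truth, truth)])
```

For a single image, the two scores are related by DSC = 2·IoU / (1 + IoU), so they rank individual predictions identically but can diverge once averaged over a dataset.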

Abstract

Automatic analysis of colonoscopy images has been an active field of research, motivated by the importance of early detection of precancerous polyps. However, detecting polyps during a live examination can be challenging because of factors such as variation in skill and experience among endoscopists, lack of attentiveness, and fatigue, all of which lead to a high polyp miss rate. Deep learning has emerged as a promising solution to this challenge, as it can assist endoscopists in detecting and classifying overlooked polyps and abnormalities in real time. In addition to an algorithm's accuracy, transparency and interpretability are crucial for explaining the whys and hows of its predictions. Further, most algorithms are developed on private data or as closed-source or proprietary software, and the methods lack reproducibility. Therefore, to promote the development of efficient and transparent methods, we organized the "Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image Segmentation (MedAI 2021)" competitions. We present a comprehensive summary, analyze each contribution, highlight the strengths of the best-performing methods, and discuss the possibility of translating such methods into the clinic. For the transparency task, a multi-disciplinary team, including expert gastroenterologists, assessed each submission and evaluated each team on open-source practices, failure case analysis, ablation studies, and the usability and understandability of evaluations, to gain a deeper understanding of the models' credibility for clinical deployment. Through this comprehensive analysis of the challenges, we not only highlight the advancements in polyp and surgical instrument segmentation but also encourage qualitative evaluation for building more transparent and understandable AI-based colonoscopy systems.

Paper Structure (30 sections, 12 figures, 8 tables)

This paper contains 30 sections, 12 figures, 8 tables.

Introduction
Challenge description
Medico 2020 Automatic Polyp Segmentation Challenge
MedAI: Transparency in Medical Image Segmentation Challenge
Related Work
Challenge datasets and evaluation metrics
Medico 2020 dataset
MedAI Transparency challenge 2021 dataset
Metrics for polyp and instrument segmentation tasks
Metrics for efficiency tasks
Metrics for transparency tasks
Participating Research Teams
Methods used in Medico 2020
Methods used in MedAI 2021
Results
...and 15 more sections

Figures (12)

Figure 1: Overview of the "Medico 2020 Polyp" and "MedAI 2021 Transparency" challenges. We describe each task along with the number of training and testing datasets and the evaluation metrics used in the tasks.
Figure 2: Example of the test datasets from the Medico 2020 and MedAI 2021 datasets.
Figure 3: Data distribution details of train and test sets used in Medico 2020 and MedAI 2021 challenges. Large, medium, and small represent the distribution information of regions of interest in the data samples.
Figure 4: Summary of the participating teams' algorithms for Medico 2020. Here, "Aug." = augmentation used, "SGD" = stochastic gradient descent, "GAN" = generative adversarial network, "ASPP" = atrous spatial pyramid pooling, and "AP" = average precision.
Figure 5: Overview of the winning solution for the Polyp segmentation task (Task 1) from Team PRML2020GU. The architecture utilizes pre-trained weights from EfficientNet in the encoder. Additionally, it uses dense skip connections, deep supervision and channel-spatial attention for fast convergence and better performance.
...and 7 more figures
