RRTS Dataset: A Benchmark Colonoscopy Dataset from Resource-Limited Settings for Computer-Aided Diagnosis Research

Ridoy Chandra Shil, Ragib Abid, Tasnia Binte Mamun, Samiul Based Shuvo, Masfique Ahmed Bhuiyan, Jahid Ferdous

TL;DR

This work introduces the BUET Polyp Dataset (BPD), a large, real-world colonoscopy image collection from a resource-constrained hospital in Bangladesh, with polyp and non-polyp frames and expert-verified segmentation masks. By benchmarking segmentation with UNet-based backbones (including InceptionV4-UNet) and classification with transfer-learned CNNs (VGG16, ResNet50, InceptionV3), the study demonstrates that segmentation under real-world artifacts remains markedly harder than on curated datasets, achieving a peak Dice score of about 0.64, while classification achieves an accuracy up to 0.91. The dataset emphasizes practical challenges such as glare, motion blur, stool, and uneven illumination, highlighting the need for preprocessing, domain adaptation, and possibly a tiered CAD workflow in low-resource settings. Overall, BPD provides a valuable resource to stress-test CAD systems, promote robustness to artifacts, and guide future work toward multi-center data, temporal analysis, and lightweight architectures for broader clinical impact.

Abstract

Background and Objective: Colorectal cancer prevention relies on early detection of polyps during colonoscopy. Existing public datasets, such as CVC-ClinicDB and Kvasir-SEG, provide valuable benchmarks but are limited by small sample sizes, curated image selection, or a lack of real-world artifacts. There remains a need for datasets that capture the complexity of clinical practice, particularly in resource-constrained settings.

Methods: We introduce the BUET Polyp Dataset (BPD), a collection of colonoscopy images acquired with Olympus 170 and Pentax i-Scan series endoscopes under routine clinical conditions. The polyp images come with expert-annotated binary masks and reflect diverse challenges such as motion blur, specular highlights, stool artifacts, blood, and low-light frames. Annotations were manually reviewed by clinical experts to ensure quality. To establish baseline performance, we provide benchmark results for classification using VGG16, ResNet50, and InceptionV3, and for segmentation using UNet variants with VGG16, ResNet34, and InceptionV4 backbones.

Results: The dataset comprises 1,288 polyp images from 164 patients, with corresponding ground-truth masks, and 1,657 polyp-free images from 31 patients. Benchmarking experiments achieved up to 90.8% accuracy for binary classification (VGG16) and a maximum Dice score of 0.64 for segmentation (InceptionV4-UNet). Performance was lower than on curated datasets, reflecting the real-world difficulty of images with artifacts and variable quality.
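For readers unfamiliar with the segmentation metric quoted above, the Dice score between a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The following is a minimal, dependency-free sketch of that formula and not the authors' evaluation code. The function name `dice_score`, the toy 4x4 masks, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
def dice_score(pred, truth):
    """Dice = 2|P ∩ G| / (|P| + |G|) for two binary masks given as nested lists."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels marked as polyp in both masks
    pred_area = 0     # pixels marked as polyp in the prediction
    truth_area = 0    # pixels marked as polyp in the ground truth
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p, t = bool(p), bool(t)
            intersection += p and t
            pred_area += p
            truth_area += t

    if pred_area + truth_area == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / (pred_area + truth_area)


# Toy example (hypothetical data): 4 ground-truth pixels, 3 predicted, 2 shared.
truth_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
example_dice = dice_score(pred_mask, truth_mask)  # 2 * 2 / (3 + 4) = 4/7
```

In the toy example the prediction captures half of the polyp and spills one pixel outside it, giving a Dice score of about 0.57. A dataset-level figure is typically an average of such per-image scores, though the exact averaging used in the paper is not stated in this section.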

Paper Structure

This paper contains 34 sections, 6 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Examples of colorectal polyp morphologies observed during colonoscopy. (a) Sessile polyp lying flat against the colonic mucosa, making detection more challenging. (b, c) Pedunculated polyps with stalk-like structures protruding from the mucosal surface.
  • Figure 2: Progression of colorectal cancer from benign adenomatous polyps to malignant invasive cancer [healthline_polyp_size].
  • Figure 3: t-SNE visualization of ResNet18 feature embeddings for polyp and non-polyp images. Polyp samples form several compact clusters, while non-polyp samples are more diffusely distributed, with notable overlap between the two classes.
  • Figure 4: Qualitative examples of segmentation performance of InceptionV4-UNet model. The top row shows a case of good performance where the predicted mask closely matches the ground truth. The bottom row shows a case of poor performance, where the prediction fails to capture the polyp boundaries accurately.
  • Figure 5: Training and validation loss curves of InceptionV4-UNet during segmentation.
  • ...and 1 more figure