HQColon: A Hybrid Interactive Machine Learning Pipeline for High Quality Colon Labeling and Segmentation

Martina Finocchiaro, Ronja Stern, Abraham George Smith, Jens Petersen, Kenny Erleben, Melanie Ganz

TL;DR

HQColon presents a fully automatic, high-resolution colon segmentation method for CT colonography to support digital twins and AI-driven diagnostics. It combines a semi-automatic, expert-validated labeling pipeline with an interactive ML step for fluid pockets, and trains four 3D nnU-Net models on raw and masked inputs to segment both the air-filled and the full colon. Compared with the open-source TotalSegmentator, HQColon achieves substantially lower boundary error (HD95) and surface distance (ASSD), and captures challenging features such as fluid pockets and haustral folds, with a typical inference time of around 69 seconds on a high-end GPU. The work provides open-source code and a large, publicly available annotated dataset, reducing labeling effort and enabling broad adoption in research and clinical workflows.

Abstract

High-resolution colon segmentation is crucial for clinical and research applications, such as digital twins and personalized medicine. However, the leading open-source abdominal segmentation tool, TotalSegmentator, struggles with accuracy for the colon, which has a complex and variable shape, requiring time-intensive labeling. Here, we present the first fully automatic high-resolution colon segmentation method. To develop it, we first created a high-resolution colon dataset using a pipeline that combines region growing with interactive machine learning to efficiently and accurately label the colon on CT colonography (CTC) images. Based on the generated dataset consisting of 435 labeled CTC images, we trained an nnU-Net model for fully automatic colon segmentation. Our fully automatic model achieved an average symmetric surface distance of 0.2 mm (vs. 4.0 mm from TotalSegmentator) and a 95th percentile Hausdorff distance of 1.0 mm (vs. 18 mm from TotalSegmentator). Our segmentation accuracy substantially surpasses that of TotalSegmentator. We share our trained model and pipeline code, providing the first and only open-source tool for high-resolution colon segmentation. Additionally, we created a large-scale dataset of publicly available high-resolution colon labels.
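The two metrics quoted in the abstract, average symmetric surface distance (ASSD) and 95th percentile Hausdorff distance (HD95), are not defined in this summary. As a rough illustration only, the sketch below computes both for toy binary masks stored as sets of (z, y, x) voxel indices. Everything in it is an assumption rather than the authors' evaluation code: the function names, the 6-connected definition of a surface voxel, the nearest-rank percentile, and taking HD95 as the larger of the two directed 95th percentiles (other conventions exist). The search is brute force, so it is only suitable for very small masks.

```python
import math

# 6-connected neighbourhood used to decide which voxels lie on the surface (assumption).
OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def surface(voxels):
    """Return the voxels that have at least one 6-connected neighbour outside the mask."""
    return {
        v for v in voxels
        if any((v[0] + dz, v[1] + dy, v[2] + dx) not in voxels for dz, dy, dx in OFFSETS)
    }


def directed_distances(src, dst, spacing):
    """Distance in mm from each voxel in src to its nearest voxel in dst (brute force)."""
    def to_mm(p):
        return [i * s for i, s in zip(p, spacing)]
    dst_mm = [to_mm(q) for q in dst]
    return [min(math.dist(to_mm(p), q) for q in dst_mm) for p in src]


def percentile(values, q):
    """Nearest-rank percentile of a non-empty list (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def assd_and_hd95(pred, ref, spacing=(1.0, 1.0, 1.0)):
    """Compute (ASSD, HD95) in mm for two non-empty voxel sets with the given voxel spacing."""
    pred_surface, ref_surface = surface(pred), surface(ref)
    d_pred_to_ref = directed_distances(pred_surface, ref_surface, spacing)
    d_ref_to_pred = directed_distances(ref_surface, pred_surface, spacing)
    # ASSD: mean of all surface-to-surface distances, pooled over both directions.
    assd = (sum(d_pred_to_ref) + sum(d_ref_to_pred)) / (len(d_pred_to_ref) + len(d_ref_to_pred))
    # HD95: larger of the two directed 95th percentiles (assumed convention).
    hd95 = max(percentile(d_pred_to_ref, 95), percentile(d_ref_to_pred, 95))
    return assd, hd95


def cube(start, size):
    """Hypothetical helper: a solid cube of voxels, used only to build toy masks."""
    z0, y0, x0 = start
    return {
        (z, y, x)
        for z in range(z0, z0 + size)
        for y in range(y0, y0 + size)
        for x in range(x0, x0 + size)
    }
```

For example, `assd_and_hd95(cube((0, 0, 0), 3), cube((1, 0, 0), 3), spacing=(1.0, 1.0, 1.0))` compares a 3×3×3 cube with a copy shifted by one voxel along z. On real CT volumes, a distance-transform-based implementation would be the practical choice.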

Paper Structure

This paper contains 11 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: Example of axial, sagittal, and coronal CT colonography slices (top) with corresponding air- and fluid-filled colon annotations (bottom). On the right, 3D reconstructions show the colon alone (bottom) and with the small bowel (top).
  • Figure 2: Two-step pipeline: (1) generation of a high-resolution colon-labeled dataset; (2) training and testing for fully automated colon segmentation.
  • Figure 3: Examples of fluid visualization on axial slices: (A) supine and (B) prone patient. C, D show RootPainter fluid segmentation.
  • Figure 4: Dataflow for creating the high-resolution annotated dataset to train and test the fully automatic model for colon segmentation. The gray box indicates reasons for scan exclusion. The plots on the left display the gender and age distribution in the final annotated dataset.
  • Figure 5: Dice as a function of the number of annotated images and annotation time.
  • ...and 2 more figures