Benchmarking Multi-Organ Segmentation Tools for Multi-Parametric T1-weighted Abdominal MRI

Nicole Tran, Anisa Prasad, Yan Zhuang, Tejas Sudharshan Mathai, Boah Kim, Sydney Lewis, Pritam Mukherjee, Jianfei Liu, Ronald M. Summers

TL;DR

This work tackles the problem of robust multi-organ segmentation in multi-parametric abdominal MRI by benchmarking three public tools (MRSegmentator, TotalSegmentator MRI, TotalVibeSegmentator) on a curated 40-volume T1-weighted dataset derived from the Duke Liver Dataset. The authors quantify performance across four sequence types (pre-contrast, arterial, venous, and delayed) using 10 labeled abdominal structures and two metrics, Dice similarity and Hausdorff Distance, with statistical tests to assess cross-tool differences. MRSegmentator achieves the best overall performance, with a Dice of $80.7 \pm 18.6$ and an HD of $8.9 \pm 10.4$ mm ($p<0.001$ vs. the others), and shows consistent superiority across large, medium, and small organs. The results highlight the influence of training data (MRI vs CT mixtures) on generalization to abdominal MRI and inform tool selection and dataset design for reliable abdominal organ segmentation in clinical workflows.

Abstract

The segmentation of multiple organs in multi-parametric MRI studies is critical for many applications in radiology, such as correlating imaging biomarkers with disease status (e.g., cirrhosis, diabetes). Recently, three publicly available tools, namely MRSegmentator (MRSeg), TotalSegmentator MRI (TS), and TotalVibeSegmentator (VIBE), have been proposed for multi-organ segmentation in MRI. However, the performance of these tools on specific MRI sequence types has not yet been quantified. In this work, a subset of 40 volumes from the public Duke Liver Dataset was curated. The curated dataset contained 10 volumes each from the pre-contrast fat-saturated T1, arterial T1w, venous T1w, and delayed T1w phases. Ten abdominal structures were manually annotated in these volumes. Next, the performance of the three public tools was benchmarked on this curated dataset. The results indicated that MRSeg obtained a Dice score of 80.7 $\pm$ 18.6 and a Hausdorff Distance (HD) error of 8.9 $\pm$ 10.4 mm. It performed best ($p < .05$) across the different sequence types compared with TS and VIBE.
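The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `dice_percent` and `hausdorff_mm`, the representation of a mask as a set of `(z, y, x)` voxel indices, the voxel `spacing` values, and the choice to compute the Hausdorff distance over all voxels (rather than over surface voxels only) are assumptions made for illustration.

```python
from math import dist


def dice_percent(pred, ref):
    """Dice similarity (in percent) between two sets of voxel indices."""
    if not pred and not ref:
        # Assumption: two empty masks are treated as perfect agreement.
        return 100.0
    return 100.0 * 2 * len(pred & ref) / (len(pred) + len(ref))


def hausdorff_mm(pred, ref, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance in mm (brute force, all voxels)."""
    if not pred or not ref:
        # Undefined when one mask is empty; reported here as infinity.
        return float("inf")
    # Convert voxel indices to physical coordinates in mm.
    a = [tuple(i * s for i, s in zip(v, spacing)) for v in pred]
    b = [tuple(i * s for i, s in zip(v, spacing)) for v in ref]

    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Toy example: a 2x2 reference square and a prediction shifted by one column.
ref = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
pred = {(0, 0, 1), (0, 0, 2), (0, 1, 1), (0, 1, 2)}

print(dice_percent(pred, ref))                           # 50.0
print(hausdorff_mm(pred, ref))                           # 1.0
print(hausdorff_mm(pred, ref, spacing=(3.0, 1.5, 1.5)))  # 1.5
```

In a benchmark of this kind, such per-structure scores would typically be computed for each volume and then summarized as mean $\pm$ standard deviation. The brute-force search is quadratic in the number of voxels, so it is suitable only for small illustrative masks.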

Paper Structure

This paper contains 8 sections, 10 figures, 3 tables.

Figures (10)

  • Figure 1: We curated a subset of the Duke Liver dataset consisting of 40 volumes, 10 each from pre-contrast T1, arterial T1w, venous T1w, and delayed T1w series. 10 common abdominal organs (bottom right) were manually segmented in these volumes and verified by a senior board-certified radiologist. Examples of the manual segmentations for these structures at different slices (from superior to inferior) in one scan are shown.
  • Figure 2: Comparison of the DSC and Hausdorff Distance (HD) errors across all 10 structures in 40 volumes for the different multi-organ MRI segmenters.
  • Figure 3: Comparison of multi-organ segmentations by TS, VIBE, and MRSeg for four different patients containing various disease conditions. Case 1 shows a normal patient with no disease. Case 2 shows a patient with liver cirrhosis. Note the over-segmentation of the liver into adjacent ascites (fluid region, red arrows). Case 3 shows a patient with multiple splenic lesions (red arrows). Case 4 shows a patient with a lesion in the left kidney.
  • Figure 4: Box plot comparing the DSC for large abdominal organs (spleen, stomach, liver).
  • Figure 5: Box plot comparing Hausdorff distances (in mm) for large abdominal organs (spleen, stomach, liver).
  • ...and 5 more figures