Understanding Benefits and Pitfalls of Current Methods for the Segmentation of Undersampled MRI Data

Jan Nikolas Morshuis, Matthias Hein, Christian F. Baumgartner

TL;DR

The paper benchmarks segmentation performance for undersampled MRI using seven approaches split between two-stage (reconstruction then segmentation) and one-stage (joint reconstruction-segmentation) strategies across two multi-coil knee datasets. It demonstrates that data-consistency in reconstruction is a key driver of segmentation quality, while many complex joint methods offer limited gains over simple baselines. The study highlights that high-fidelity reconstructions (as measured by PSNR/SSIM) do not necessarily translate to better segmentations, and that two-stage methods enforcing k-space consistency often perform best. The findings support a practical workflow separating reconstruction from segmentation when segmenting undersampled MRI data and call for standardized, broad benchmarks across datasets and acceleration factors. The work provides actionable insights for choosing segmentation pipelines in accelerated MRI and lays groundwork for future cross-dataset comparisons.

Abstract

MR imaging is a valuable diagnostic tool that allows non-invasive visualization of patient anatomy and pathology with high soft-tissue contrast. However, MRI acquisition is typically time-consuming, leading to patient discomfort and increased costs to the healthcare system. Recent years have seen substantial research effort into the development of methods that allow for accelerated MRI acquisition while still obtaining a reconstruction that appears similar to the fully-sampled MR image. However, for many applications a perfectly reconstructed MR image may not be necessary, particularly when the primary goal is a downstream task such as segmentation. This has led to growing interest in methods that aim to perform segmentation directly on accelerated MRI data. Despite recent advances, existing methods have largely been developed in isolation, without direct comparison to one another, often using separate or private datasets, and lacking unified evaluation standards. To date, no high-quality, comprehensive comparison of these methods exists, and the optimal strategy for segmenting accelerated MR data remains unknown. This paper provides the first unified benchmark for the segmentation of undersampled MRI data, comparing 7 approaches. A particular focus is placed on comparing one-stage approaches, which combine reconstruction and segmentation into a unified model, with two-stage approaches, which apply established MRI reconstruction methods followed by a segmentation network. We test these methods on two MRI datasets that include multi-coil k-space data as well as human-annotated segmentation ground truth. We find that simple two-stage methods that consider data-consistency lead to the best segmentation scores, surpassing complex specialized methods that were developed specifically for this task.
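To make the notion of data-consistency concrete, the following is a minimal sketch of a hard data-consistency step, not the implementation used in the paper. It assumes a single-coil, Cartesian setting for brevity (the benchmarked datasets are multi-coil), and the function name `data_consistency` and all variable names are hypothetical. The idea is that wherever k-space was actually measured, the reconstruction's k-space values are overwritten with the measurements.

```python
import numpy as np

def data_consistency(image, measured_kspace, mask):
    """Overwrite the k-space of `image` with measured values where mask is True.

    Assumption: single-coil Cartesian sampling; `mask` is a boolean array
    marking the sampled k-space locations.
    """
    k_pred = np.fft.fft2(image, norm="ortho")
    k_dc = np.where(mask, measured_kspace, k_pred)
    return np.fft.ifft2(k_dc, norm="ortho")

# Toy example with synthetic data (no real MRI data involved).
rng = np.random.default_rng(0)
ground_truth = rng.standard_normal((8, 8))

mask = np.zeros((8, 8), dtype=bool)
mask[:, ::2] = True  # keep every second k-space column (2x undersampling)
measured = np.where(mask, np.fft.fft2(ground_truth, norm="ortho"), 0)

estimate = rng.standard_normal((8, 8))  # stand-in for a network output
corrected = data_consistency(estimate, measured, mask)

# After the step, the sampled k-space entries agree with the measurements.
k_corrected = np.fft.fft2(corrected, norm="ortho")
```

In a two-stage pipeline of the kind described above, an image corrected in this way would then be passed to a separate segmentation network.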

Paper Structure

This paper contains 15 sections, 2 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: We separate modern methods that are optimized to segment undersampled MRI data into two categories: Two-Stage Approaches and One-Stage Approaches. Two-stage approaches first reconstruct the undersampled MR image; a simple UNet is then used to segment the reconstructed image. One-stage approaches try to combine both tasks, and it is often claimed that this combination helps to achieve better reconstructions and segmentations. Our results suggest the contrary.
  • Figure 2: Example reconstructions and segmentations for the tested methods on the K2S dataset. False-positive segmentations are marked in red, and false-negative segmentations are marked in blue. The RegSeg method does not necessarily generate reconstructions similar to the ground truth, as this is not an objective during training; instead, the method focuses solely on generating good segmentations.
  • Figure 3: Dice score vs. acceleration factor (left) and SSIM score vs. acceleration factor (right) for the K2S dataset. Note that for better readability we did not include RegSeg scores in the SSIM plot on the right, as the method has solely been developed to improve segmentation scores and does not aim at creating a good reconstruction. We provide SSIM scores for RegSeg in Table \ref{tab:ssim_k2s}. Interestingly, CS-Rec achieves the highest Dice scores while only achieving relatively low SSIM scores. The PSNR score behaves similarly to the SSIM score across accelerations.
  • Figure 4: Relation between PSNR values and the Dice score for the SKM-TEA dataset with 16$\times$ acceleration (left) and the K2S dataset with 32$\times$ acceleration (right). Every reconstructed image in the test set is a single data point. The regression fits show that no positive correlation exists between the PSNR and Dice scores within the methods. This observation is confirmed quantitatively, with Spearman's $\rho$ ranging from $0$ to $-0.20$ (all $p$-values $> 0.2$).
  • Figure 5: Analysis of the segmentation loss weight for the TwoDec method on the validation set of SKM-TEA 16$\times$ data. Higher weights on the segmentation loss lead to higher segmentation scores, but the reconstruction quality declines.
  • ...and 1 more figures
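
The captions above rely on two quantities: the Dice score of a segmentation and Spearman's rank correlation between per-image PSNR and Dice values. The following is a minimal sketch of both, not the paper's evaluation code. The function names `dice_score` and `spearman_rho` are hypothetical, the Dice score is computed for a single binary class, and the rank correlation assumes there are no tied values (a library routine such as `scipy.stats.spearmanr` handles ties and also reports a p-value).

```python
import numpy as np

def dice_score(pred, target):
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return float(2.0 * np.logical_and(pred, target).sum() / denom)

def spearman_rho(x, y):
    """Spearman rank correlation, assuming no tied values in x or y."""
    rank_x = np.argsort(np.argsort(x))  # rank of each element of x
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

# Toy values for illustration only; these are not results from the paper.
overlap = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
rho_up = spearman_rho([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
rho_down = spearman_rho([1.0, 2.0, 3.0, 4.0], [40.0, 30.0, 20.0, 10.0])
```

A rank correlation near zero or below, as reported for Figure 4, means that images with higher PSNR do not tend to have higher Dice scores within a given method.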