REHRSeg: Unleashing the Power of Self-Supervised Super-Resolution for Resource-Efficient 3D MRI Segmentation

Zhiyun Song, Yinjie Zhao, Xiaomin Li, Manman Fei, Xiangyu Zhao, Mengjun Liu, Cunjian Chen, Chung-Hsing Yeh, Qian Wang, Guoyan Zheng, Songtao Ai, Lichi Zhang

TL;DR

Experimental results demonstrate that REHRSeg achieves high-quality HR segmentation without intensive supervision, while also significantly improving the baseline performance for LR segmentation.

Abstract

High-resolution (HR) 3D magnetic resonance imaging (MRI) can provide detailed anatomical structural information, enabling precise segmentation of regions of interest (ROIs) for various medical image analysis tasks. Because HR acquisition places high demands on the acquisition device, collecting HR images together with their annotations is often impractical in clinical scenarios. Consequently, segmentation results based on low-resolution (LR) images with large slice thickness are often unsatisfactory for subsequent tasks. In this paper, we propose a novel Resource-Efficient High-Resolution Segmentation framework (REHRSeg) to address these challenges in real-world applications; it achieves HR segmentation while using only LR images as input. REHRSeg leverages self-supervised super-resolution (self-SR) to provide pseudo supervision, so the relatively easier-to-acquire annotated LR images generated by 2D scanning protocols can be used directly for model training. Our contributions toward making self-SR effective for enhancing segmentation are three-fold: (1) We mitigate the data scarcity problem in the medical field by using pseudo-data to train the segmentation model. (2) We design an uncertainty-aware super-resolution (UASR) head in self-SR to raise awareness of segmentation uncertainty, which commonly appears at ROI boundaries. (3) We align the spatial features of self-SR and segmentation through structural knowledge distillation to better capture region correlations. Experimental results demonstrate that REHRSeg achieves high-quality HR segmentation without intensive supervision, while also significantly improving the baseline performance for LR segmentation.

Paper Structure

This paper contains 26 sections, 11 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: Comparison between (a) the conventional approach to high-resolution segmentation from low-resolution images and (b) the proposed REHRSeg framework. Instead of using high-resolution annotations, we use a self-supervised super-resolution model to provide pseudo supervision for the HR segmentation task, and further explore its capability to enhance the segmentation model from three different perspectives.
  • Figure 2: Overall architecture of the proposed REHRSeg method, which includes two training stages: (a) training of the self-SR model for annotated MR images, and (b) super-resolution guided segmentation. In the first stage, the self-SR model is initialized from a video frame interpolation (FI) model, and we introduce an uncertainty-aware super-resolution (UASR) head to make the model aware of uncertain regions that are difficult to reconstruct. Self-supervision is achieved by learning the mapping from the downsampled LR data to the original LR data. In the second stage, self-SR provides pseudo supervision for the segmentation task through synthetic data. The uncertainty is used to help the segmentation model recognize blurred boundaries. We also introduce structural knowledge distillation (KD) between the self-SR and segmentation models to help capture important correlations between regions.
  • Figure 3: Illustration of the proposed uncertainty-aware super-resolution (UASR) head for self-SR. The features from the last layer are processed through three independent branches, which produce the intermediate image maps, uncertainty maps, and segmentation maps. The final SR outputs are obtained by multiplying each intermediate image map by its uncertainty map and then summing the products.
  • Figure 4: Illustration of the structural knowledge distillation. The feature maps $\mathcal{F}^{sr}$ and $\mathcal{F}^{seg}$ are extracted from the self-SR and segmentation models, respectively. Knowledge distillation is performed by correlation distillation on the constructed affinity graph and spatial distillation on the feature maps.
  • Figure 5: Qualitative results for low-resolution segmentation on the Meningioma-SEG-CLASS dataset (first two rows) and an in-house pelvic tumor dataset (last two rows). Red lines denote the ground truth and blue lines denote the predictions.
  • ...and 2 more figures
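
The self-supervision described in the Figure 2 caption (learning from downsampled LR data paired with the original LR data) can be illustrated with a minimal NumPy sketch. The degradation used here, dropping slices along one axis, together with the function name `make_self_sr_pair` and the `factor` and `axis` parameters, are assumptions made for illustration; the text above does not specify how the authors downsample.

```python
import numpy as np


def make_self_sr_pair(lr_volume: np.ndarray, factor: int = 2, axis: int = 0):
    """Build one (input, target) training pair from a single LR volume.

    Assumption: the degradation is plain slice dropping along `axis`;
    a real pipeline may blur or resample instead.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    index = [slice(None)] * lr_volume.ndim
    index[axis] = slice(None, None, factor)  # keep every `factor`-th slice
    degraded = lr_volume[tuple(index)]
    # The original LR volume is the target, so no HR data is required.
    return degraded, lr_volume


# Toy volume with 6 slices of 4x4 voxels.
volume = np.arange(6 * 4 * 4, dtype=np.float32).reshape(6, 4, 4)
inp, tgt = make_self_sr_pair(volume, factor=2)
print(inp.shape, tgt.shape)  # (3, 4, 4) (6, 4, 4)
```

A model trained on such pairs learns to restore the dropped slices, which is the sense in which no HR ground truth is needed.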
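
The fusion step in the Figure 3 caption (multiply each intermediate image map by its uncertainty map, then sum) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the number of intermediate maps `K`, the softmax normalization across them, and the name `fuse_with_uncertainty` are not given in the text above.

```python
import numpy as np


def fuse_with_uncertainty(intermediate: np.ndarray,
                          uncertainty_logits: np.ndarray) -> np.ndarray:
    """Fuse K intermediate image maps into one SR output.

    Both inputs have shape (K, H, W). Assumption: the uncertainty maps are
    normalized with a softmax over the K axis so they act as per-pixel weights.
    """
    if intermediate.shape != uncertainty_logits.shape:
        raise ValueError("inputs must have the same shape")
    # Numerically stable softmax across the K candidate maps.
    shifted = uncertainty_logits - uncertainty_logits.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    # Multiply each intermediate map by its weight map, then add them up.
    return (intermediate * weights).sum(axis=0)


rng = np.random.default_rng(0)
candidates = rng.random((3, 8, 8))         # K = 3 hypothetical intermediate maps
logits = rng.standard_normal((3, 8, 8))    # matching uncertainty logits
sr = fuse_with_uncertainty(candidates, logits)
print(sr.shape)  # (8, 8)
```

With softmax weights, each output pixel is a convex combination of the candidate values, so the fused image stays within the range of the intermediate maps.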
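
The two distillation terms named in the Figure 4 caption, correlation distillation on an affinity graph and spatial distillation on the feature maps, might look like the sketch below. The concrete choices are assumptions for illustration only: cosine similarity for the affinity graph, a channel-pooled attention map for the spatial term, mean squared error for both, and the weights `w_corr` and `w_spatial`. The paper's own loss definitions are not reproduced in the text above.

```python
import numpy as np


def affinity(features: np.ndarray) -> np.ndarray:
    """Cosine-similarity graph between spatial locations.

    features: (C, *spatial) -> (N, N), where N is the number of locations.
    """
    nodes = features.reshape(features.shape[0], -1)
    norms = np.linalg.norm(nodes, axis=0, keepdims=True)
    nodes = nodes / np.maximum(norms, 1e-12)
    return nodes.T @ nodes


def spatial_attention(features: np.ndarray) -> np.ndarray:
    """Channel-pooled energy map, L2-normalized over all locations."""
    attn = (features ** 2).mean(axis=0)
    return attn / max(np.linalg.norm(attn), 1e-12)


def structural_kd_loss(f_sr, f_seg, w_corr=1.0, w_spatial=1.0):
    """Assumed form: MSE on affinity graphs plus MSE on attention maps."""
    if f_sr.shape[1:] != f_seg.shape[1:]:
        raise ValueError("feature maps must share spatial dimensions")
    corr = np.mean((affinity(f_sr) - affinity(f_seg)) ** 2)
    spat = np.mean((spatial_attention(f_sr) - spatial_attention(f_seg)) ** 2)
    return w_corr * corr + w_spatial * spat


rng = np.random.default_rng(0)
f_sr = rng.standard_normal((16, 4, 6, 6))   # hypothetical self-SR features
f_seg = rng.standard_normal((8, 4, 6, 6))   # hypothetical segmentation features
loss = structural_kd_loss(f_sr, f_seg)
print(loss > 0)  # True
```

Both terms compare location-by-location structure rather than raw channels, so in this sketch the two networks may have different channel counts as long as their spatial dimensions match.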