Relation U-Net

Sheng He; Rina Bao; P. Ellen Grant; Yangming Ou

TL;DR

Relation U-Net estimates a per-image confidence for medical image segmentation when no ground truth is available. It is a two-input, four-output network that learns both the segmentation of each input image and the pairwise relations between the two inputs. It outputs two segmentation maps ($\hat{s}_1, \hat{s}_2$) and two relation maps ($\hat{r}^p, \hat{r}^c$); the confidence score $\mathcal{C}$ is the Dice-based agreement between $\hat{r}^p$ and $\hat{r}^c$. Across LiTS, Hippocampus, BraTS, and ISIC, Relation U-Net improves Dice accuracy over vanilla U-Net and MC-Dropout baselines, and $\mathcal{C}$ correlates with segmentation accuracy, so test samples can be ranked by difficulty without ground truth. In a clinical workflow, this allows low-confidence cases to be flagged and prioritized for expert review while still providing improved per-image predictions.

Abstract

Towards clinical interpretation, this paper presents a new "output-with-confidence" segmentation neural network that takes multiple input images and outputs multiple segmentation maps together with their pairwise relations. A confidence score for a test image without ground truth can be estimated from the difference among the estimated relation maps. We build the method on the widely used vanilla U-Net and name the new model Relation U-Net; it outputs segmentation maps of the input images as well as an estimated confidence score for the test image. Experimental results on four public datasets show that Relation U-Net not only provides better accuracy than vanilla U-Net but also estimates a confidence score that is linearly correlated with the segmentation accuracy on test images.
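The confidence score described above compares two estimated relation maps with a Dice measure. The following is a minimal sketch of that comparison, not the authors' implementation: it assumes each relation map is a flat sequence of per-pixel probabilities, that the maps are binarized at a hypothetical threshold of 0.5, and that two empty masks count as full agreement. The function names `dice_coefficient` and `confidence_score` are illustrative only.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two equally sized binary masks (flat sequences)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Total foreground pixels across the two masks.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        # Assumption: two empty masks are treated as perfectly consistent.
        return 1.0
    return 2.0 * intersection / total


def confidence_score(r_p, r_c, threshold=0.5):
    """Dice agreement between two relation maps given as probabilities.

    The threshold is an assumption of this sketch, not a value from the paper.
    """
    bin_p = [p >= threshold for p in r_p]
    bin_c = [c >= threshold for c in r_c]
    return dice_coefficient(bin_p, bin_c)
```

Under these assumptions, identical relation maps give a score of 1 and disjoint maps give 0, so a lower score marks a test image on which the two relation estimates disagree.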

Paper Structure (13 sections, 5 figures, 2 tables)

Introduction
Method
Relation U-Net
The structure of the neural network
Estimating the Confidence Score
Experimental results
Datasets
Implementation details
Accuracy of Relation U-Net
Accuracy of Relation U-Net when the input images are sampled from the same image
Accuracy of Relation U-Net when the input images are sampled from different images
Correlation between the accuracy and confidence score $\mathcal{C}$
Conclusion

Figures (5)

Figure 1: Illustration of the proposed Relation U-Net.
Figure 2: The comparison of the vanilla U-Net (a) and the proposed Relation U-Net (b). (Zoom in for a better visualization)
Figure 3: Examples of images on the ISIC2018 dataset with relation segmentation maps: $\hat{r}^p$ (green contours) and $\hat{r}^c$ (red contours). $\hat{r}^p$ and $\hat{r}^c$ coincide on easy samples but differ on difficult samples. $\mathcal{C}$ measures the consistency between the segmentation maps $\hat{r}^p$ and $\hat{r}^c$.
Figure 4: (a) Comparison of the Pearson coefficient between the segmentation accuracy and confidence scores. (b) The segmentation accuracy (y-axis) for the test images thresholded by the confidence score (x-axis). The results are averaged over five-fold cross-validation. On the test images without ground-truth, the segmentation accuracy increases as the confidence score increases.
Figure 5: Examples of the images (the first row for each dataset) and their corresponding segmentation results (the second row for each dataset) ranked by the confidence score $\mathcal{C}$. The red masks denote the ground-truth and the green contours denote the segmentation results. The Dice similarity coefficient $\mathcal{D}$ and confidence score $\mathcal{C}$ are shown under each example.
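The analyses summarized in the captions of Figures 4 and 5 (Pearson correlation between accuracy and confidence, accuracy of the images retained at a confidence threshold, and ranking by $\mathcal{C}$) can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, and it assumes that thresholding keeps the images whose confidence is at or above the threshold, which the captions do not state explicitly.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def mean_dice_above(confidences, dices, threshold):
    """Mean Dice of the images whose confidence is >= threshold.

    Returns None when no image passes the threshold.
    """
    kept = [d for c, d in zip(confidences, dices) if c >= threshold]
    return sum(kept) / len(kept) if kept else None


def rank_by_confidence(confidences):
    """Image indices ordered from least to most confident."""
    return sorted(range(len(confidences)), key=lambda i: confidences[i])
```

In a review workflow, the first indices returned by `rank_by_confidence` would be the cases to send to an expert first, since they carry the lowest confidence.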
