Redefining the Down-Sampling Scheme of U-Net for Precision Biomedical Image Segmentation

Mingjie Li, Yizheng Chen, Md Tauhidul Islam, Lei Xing

TL;DR

This paper introduces a simple but effective strategy, called Stair Pooling, which moderates the pace of down-sampling and reduces information loss by leveraging a sequence of concatenated small and narrow pooling operations in varied orientations.

Abstract

U-Net architectures have been instrumental in advancing biomedical image segmentation (BIS) but often struggle to capture long-range information. One reason is that conventional down-sampling techniques prioritize computational efficiency at the expense of information retention. This paper introduces a simple but effective strategy, which we call Stair Pooling, that moderates the pace of down-sampling and reduces information loss by leveraging a sequence of concatenated small and narrow pooling operations in varied orientations. Specifically, our method changes the dimensionality reduction within each 2D pooling step from a factor of $\frac{1}{4}$ to $\frac{1}{2}$. This approach can also be adapted for 3D pooling to preserve even more information. Such preservation helps the U-Net reconstruct spatial details more effectively during the up-sampling phase, thereby enhancing its ability to capture long-range information and improving segmentation accuracy. Extensive experiments on three BIS benchmarks demonstrate that the proposed Stair Pooling can increase both 2D and 3D U-Net performance by an average of 3.8% in Dice scores. Moreover, we use transfer entropy to select the optimal down-sampling paths and quantitatively show how the proposed Stair Pooling reduces information loss.
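The change from $\frac{1}{4}$ to $\frac{1}{2}$ per step can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the helper names `max_pool_2d` and `stair_pool_2d` and the `order` parameter are hypothetical, and the convolution and ReLU that the paper places after each pooling step are omitted. Under those assumptions, one 2×2 pooling is split into a 1×2 step and a 2×1 step, and every intermediate map is kept so the per-step reduction can be inspected.

```python
def max_pool_2d(x, kh, kw):
    """Non-overlapping max pooling with a kh x kw kernel on a 2D list of numbers."""
    h, w = len(x), len(x[0])
    return [
        [
            max(x[i + di][j + dj] for di in range(kh) for dj in range(kw))
            for j in range(0, w - kw + 1, kw)
        ]
        for i in range(0, h - kh + 1, kh)
    ]


def stair_pool_2d(x, order=("w", "h")):
    """Apply narrow pooling steps one at a time and return every intermediate map.

    `order` (an assumption of this sketch) lists the axis halved at each step:
    "w" uses a 1x2 kernel, "h" uses a 2x1 kernel.
    """
    stages = [x]
    for axis in order:
        kernel = (1, 2) if axis == "w" else (2, 1)
        stages.append(max_pool_2d(stages[-1], *kernel))
    return stages


def size(x):
    """Number of elements in a 2D list."""
    return len(x) * len(x[0])


image = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 toy input
stages = stair_pool_2d(image)

# Fraction of elements kept by each narrow step, versus one conventional 2x2 step.
ratios = [size(b) / size(a) for a, b in zip(stages, stages[1:])]
conventional_ratio = size(max_pool_2d(image, 2, 2)) / size(image)

print([(len(s), len(s[0])) for s in stages])
print(ratios, conventional_ratio)
```

Because this sketch has no layers between the two narrow steps, its final map equals a plain 2×2 max pool. Only the intermediate 4×2 resolution is new. In the paper's design, the convolution and ReLU inserted after each pooling step operate on that intermediate resolution.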

Paper Structure (15 sections, 8 equations, 5 figures, 4 tables, 1 algorithm)

Figures (5)

  • Figure 1: The overview of our proposed Stair Pooling. It splits the original max pooling layer into a series of concatenated small and narrow pooling kernels. To break the linear relationship, each pooling operation is followed by a convolutional layer and a ReLU activation.
  • Figure 2: Qualitative comparison of different approaches on the Synapse dataset. From left to right: Ground Truth, U-Net, SwinUnet, U-Net with HWT pooling, our SP U-Net, and the TE-selected variant.
  • Figure 3: Segmentation results and sizes of different models on the Synapse dataset.
  • Figure 4: Correlations between DSC and TE on the KiT23 dataset.
  • Figure 5: Violin plots for visualizing distributions of transfer entropy across different paths on three datasets.