Rethinking the Nested U-Net Approach: Enhancing Biomarker Segmentation with Attention Mechanisms and Multiscale Feature Fusion

Saad Wazir, Daeyoung Kim

TL;DR

Biomarker segmentation in medical images is challenged by limited data and by variability in morphology and staining. The paper introduces an end-to-end Nested UNet that combines Multiscale Feature Fusion with Attention Mechanisms, including a Channel Attention Module (CAM), an Attention Module (AM), and an Edge Enhancement Layer (EEL), and trains it with an Edge-Aware Loss (EAL) that emphasizes boundaries. Experiments on MoNuSeg, DSB 2018, EM, and TNBC report state-of-the-art performance with strong cross-dataset generalization and favorable parameter efficiency, supported by ablation studies. The approach targets robust, edge-aware segmentation across modalities and data regimes and shows promise for extension to gland, polyp, and organ segmentation tasks.
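
To make the idea of a boundary-emphasizing loss concrete, the following is a minimal sketch of one common way to weight a segmentation loss toward edges. It is not the paper's EAL, whose exact form is not given in this summary. The sketch assumes binary masks, a 4-neighbour definition of boundary pixels, and a hypothetical `edge_weight` parameter; the function names `boundary_map` and `edge_weighted_bce` are illustrative only.

```python
import math


def boundary_map(mask):
    """Mark pixels whose 4-neighbourhood contains a different label (assumed edge definition)."""
    h, w = len(mask), len(mask[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] != mask[y][x]:
                    edges[y][x] = 1
                    break
    return edges


def edge_weighted_bce(pred, mask, edge_weight=2.0, eps=1e-7):
    """Weighted mean of per-pixel binary cross-entropy.

    Boundary pixels get weight (1 + edge_weight); all other pixels get weight 1.
    `edge_weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    edges = boundary_map(mask)
    total, weight_sum = 0.0, 0.0
    for y, row in enumerate(mask):
        for x, target in enumerate(row):
            # Clamp probabilities so log() never sees 0 or 1.
            p = min(max(pred[y][x], eps), 1.0 - eps)
            bce = -(target * math.log(p) + (1 - target) * math.log(1.0 - p))
            weight = 1.0 + edge_weight * edges[y][x]
            total += weight * bce
            weight_sum += weight
    return total / weight_sum


# A 2x2 foreground square inside a 4x4 image.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
# Perfect prediction except for one under-confident boundary pixel.
pred = [[float(v) for v in row] for row in mask]
pred[1][1] = 0.1

plain = edge_weighted_bce(pred, mask, edge_weight=0.0)
edge_aware = edge_weighted_bce(pred, mask, edge_weight=2.0)
```

In this toy case the single error sits on the object boundary, so the edge-weighted variant penalizes it more heavily than the unweighted loss does. That is the behaviour a boundary-emphasizing loss is meant to provide.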

Abstract

Identifying biomarkers in medical images is vital for a wide range of biotech applications. However, recent Transformer- and CNN-based methods often struggle with variations in morphology and staining, which limits their feature extraction capabilities. In medical image segmentation, where data samples are often limited, state-of-the-art (SOTA) methods improve accuracy by using pre-trained encoders, while end-to-end approaches typically fall short because they struggle to transfer multiscale features effectively between encoders and decoders. To address these challenges, we introduce a nested UNet architecture that captures both local and global context through Multiscale Feature Fusion and Attention Mechanisms. This design improves feature integration from encoders, highlights key channels and regions, and restores spatial details to enhance segmentation performance. Our method surpasses SOTA approaches, as evidenced by experiments across four datasets and detailed ablation studies. Code: https://github.com/saadwazir/ReN-UNet
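
As an illustration of what "highlighting key channels" can mean in practice, here is a small sketch of squeeze-and-excitation style channel gating, a widely used form of channel attention. This summary does not specify the internals of the paper's CAM, so the structure below (global average pooling, a bias-free bottleneck with ReLU, a sigmoid gate) and the names `channel_attention`, `w1`, and `w2` are assumptions chosen for illustration; the authors' repository holds the actual design.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(features, w1, w2):
    """Rescale each channel by a learned gate in (0, 1).

    features: list of C channels, each an H x W list of floats.
    w1: hidden x C weight matrix, w2: C x hidden weight matrix (biases omitted).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: one global-average value per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]
    # Excite: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch] for ch, g in zip(features, gates)]
    return scaled, gates


# Two 2x2 channels and hand-picked weights (hypothetical; normally learned).
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
w1 = [[1.0, 0.0]]    # hidden size 1
w2 = [[0.0], [1.0]]  # one gate per channel
scaled, gates = channel_attention(features, w1, w2)
```

With these weights the first channel's gate is exactly 0.5 and the second's is larger, so the second channel is attenuated less. In a trained network the weights are learned, and the gates let the model suppress channels that carry little signal for the current image.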

Paper Structure

This paper contains 13 sections, 3 equations, 3 figures, 7 tables.

Figures (3)

  • Figure 1: (a) An overview of the proposed architecture (b) Nested UNet Block (NUB). (c) Channel Attention Module (CAM). (d) Attention Module (AM).
  • Figure 2: Comparison of trainable parameters, GFLOPs, and mIoU across models.
  • Figure 3: Qualitative Results Comparison: Black pixels represent the background, while white pixels represent the biomarker. The red box indicates the region where there is either no prediction or over-segmentation.