PraNet-V2: Dual-Supervised Reverse Attention for Medical Image Segmentation

Bo-Cheng Hu, Ge-Peng Ji, Dian Shao, Deng-Ping Fan

TL;DR

PraNet-V2 introduces Dual-Supervised Reverse Attention (DSRA) to enable robust multi-class medical image segmentation by decoupling foreground and background learning and performing semantic, class-aware attention fusion. DSRA employs independent foreground/background heads, a background mask with per-class supervision, and a learnable reverse gain to refine predictions across a cascade of decoder stages, yielding improved polyp segmentation and transferable gains when integrated into other architectures. Across four polyp datasets and two backbones, PraNet-V2 consistently outperforms PraNet-V1, and applying DSRA to state-of-the-art models yields mean Dice improvements of 0.50%–1.36%, demonstrating strong generalization and practical impact in medical image segmentation. The work highlights the importance of explicit background modeling and semantic label-space fusion, with potential extensions to open-world anomaly detection in medical imaging.
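Since the reported gains are stated in mean Dice, a minimal sketch of how that metric is commonly computed may help. This is an illustration only: the function names, the choice to skip class 0 as background, and the convention of scoring two empty masks as 1.0 are assumptions made here, and the paper's exact evaluation protocol (for example, averaging over cases or datasets) may differ.

```python
import numpy as np

def dice_score(pred_mask, target_mask):
    """Dice = 2|P ∩ T| / (|P| + |T|) for two binary masks."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    target_mask = np.asarray(target_mask, dtype=bool)
    intersection = np.logical_and(pred_mask, target_mask).sum()
    denom = pred_mask.sum() + target_mask.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / denom

def mean_dice(pred_labels, target_labels, num_classes, skip_background=True):
    """Average the per-class Dice over label maps of integer class ids."""
    first_class = 1 if skip_background else 0  # assumption: class 0 is background
    scores = [
        dice_score(pred_labels == c, target_labels == c)
        for c in range(first_class, num_classes)
    ]
    return float(np.mean(scores))

# Toy 2x3 label maps with two foreground classes (1 and 2).
pred = np.array([[0, 1, 1], [0, 2, 2]])
target = np.array([[0, 1, 0], [0, 2, 2]])
print(round(mean_dice(pred, target, num_classes=3), 4))  # 0.8333
```

In the toy example, class 1 scores 2·1/(2+1) ≈ 0.667 and class 2 scores 1.0, so the mean is about 0.833.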

Abstract

Accurate medical image segmentation is essential for effective diagnosis and treatment. Previously, PraNet-V1 was proposed to enhance polyp segmentation by introducing a reverse attention (RA) module that utilizes background information. However, PraNet-V1 struggles with multi-class segmentation tasks. To address this limitation, we propose PraNet-V2, which handles a broader range of tasks than PraNet-V1, including multi-class segmentation. At the core of PraNet-V2 is the Dual-Supervised Reverse Attention (DSRA) module, which incorporates explicit background supervision, independent background modeling, and semantically enriched attention fusion. Our PraNet-V2 framework demonstrates strong performance on four polyp segmentation datasets. Additionally, by integrating DSRA into three state-of-the-art semantic segmentation models to iteratively refine their foreground segmentation results, we achieve up to a 1.36% improvement in mean Dice score. Code is available at: https://github.com/ai4colonoscopy/PraNet-V2/tree/main/binary_seg/jittor.
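To make the reverse attention idea concrete, here is a minimal NumPy sketch of a widely used single-class formulation, in which the attention weights are taken as one minus the sigmoid of a coarse foreground prediction, so that the refinement step focuses on regions not yet marked as foreground. This is not the DSRA module: per the abstract, DSRA models the background with its own supervised branch rather than inverting the foreground map. The function names and the `project` callable are hypothetical stand-ins for learned layers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reverse_attention_weights(coarse_logits):
    """Weights near 1 where the coarse map predicts background, near 0 on foreground."""
    return 1.0 - sigmoid(coarse_logits)

def refine(features, coarse_logits, project):
    """One refinement step (illustrative only).

    features:      (C, H, W) feature map
    coarse_logits: (H, W) coarse foreground logits from a deeper stage
    project:       hypothetical callable mapping (C, H, W) -> (H, W),
                   standing in for learned convolution layers
    """
    weights = reverse_attention_weights(coarse_logits)
    attended = features * weights[None, :, :]  # broadcast weights over channels
    return coarse_logits + project(attended)   # residual update of the coarse map

# Toy usage: constant features, an undecided coarse map (logit 0 -> weight 0.5),
# and a channel mean as the stand-in projection.
features = np.ones((4, 2, 2))
coarse = np.zeros((2, 2))
refined = refine(features, coarse, project=lambda x: x.mean(axis=0))
print(refined)
```

Because the weights are derived from a single foreground map, this formulation has no separate notion of background per class, which is one way to read the multi-class limitation the abstract attributes to PraNet-V1.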

Paper Structure

This paper contains 9 sections, 2 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Illustration of the key differences between PraNet-V1 and PraNet-V2 in background modeling and task handling.
  • Figure 2: Overview of the PraNet-V2 framework and DSRA module. The pipeline processes high-level features ($F_2, F_3, F_4$) through a parallel partial decoder (PD) and three DSRA modules. Each DSRA module decodes the high-level feature $F_{i+1}$ to generate foreground and background segmentation maps, while integrating the outputs of the deeper DSRA module or the PD ($R_{i+1}^F$, $R_{i+1}^B$) to refine the foreground segmentation maps.
  • Figure 3: Segmentation results on ACDC (above) and Synapse (below) datasets, with segmentation errors highlighted in red boxes.