MSA-UNet3+: Multi-Scale Attention UNet3+ with New Supervised Prototypical Contrastive Loss for Coronary DSA Image Segmentation

Rayan Merghani Ahmed; Adnan Iltaf; Mohamed Elmanna; Gang Zhao; Hongliang Li; Yue Du; Bin Li; Shoujun Zhou

TL;DR

This work targets coronary Digital Subtraction Angiography (DSA) segmentation, a task hindered by high intra-class variance and severe background diversity. It introduces the Supervised Prototypical Contrastive Loss (SPCL), a hybrid loss that combines supervised contrastive learning, which enforces semantic embeddings in the encoder, with prototypical contrastive learning, which emphasizes hard background samples. SPCL is integrated into the MSA-UNet3+ architecture, which features a Multi-Scale Attention Encoder, a Multi-Scale Dilated Bottleneck (MSD-Bottleneck), and a Contextual Attention Fusion Module (CAFM). Experiments on a private dataset show that SPCL improves segmentation performance across multiple baseline networks and that MSA-UNet3+ with SPCL achieves the highest Dice ($87.73\%$) and F1 ($87.78\%$) scores together with reduced boundary errors (ASD $0.76$, ACD $0.74$), highlighting its clinical relevance for identifying coronary stenosis. The framework offers practical benefits for precise vessel delineation and could be extended with multimodal fusion and lightweight variants for real-time clinical deployment.

Abstract

Accurate segmentation of coronary Digital Subtraction Angiography (DSA) images is essential for diagnosing and treating coronary artery disease. Despite advances in deep learning, challenges such as high intra-class variance and class imbalance limit precise vessel delineation, and most existing approaches to coronary DSA segmentation do not address these issues. Moreover, the encoders of existing segmentation networks do not directly generate semantic embeddings, which would allow the decoder to reconstruct segmentation masks effectively from well-defined features. We propose a Supervised Prototypical Contrastive Loss (SPCL) that fuses supervised and prototypical contrastive learning to enhance coronary DSA image segmentation. The supervised contrastive loss enforces semantic embeddings in the encoder, improving feature differentiation. The prototypical contrastive loss lets the model focus on the foreground class while alleviating high intra-class variance and class imbalance by concentrating only on the hard-to-classify background samples. We implement SPCL within MSA-UNet3+, a Multi-Scale Attention-Enhanced UNet3+ architecture. The architecture integrates three key components: a Multi-Scale Attention Encoder and a Multi-Scale Dilated Bottleneck, both designed to enhance multi-scale feature extraction, and a Contextual Attention Fusion Module (CAFM), built to preserve fine-grained details while improving contextual understanding. Experiments on a private coronary DSA dataset show that MSA-UNet3+ outperforms state-of-the-art methods, achieving the highest Dice coefficient and F1-score and significantly reducing ASD and ACD. The developed framework provides clinicians with precise vessel segmentation, enabling accurate identification of coronary stenosis and supporting informed diagnostic and therapeutic decisions. The code will be released at https://github.com/rayanmerghani/MSA-UNet3plus.
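
The abstract describes SPCL only at a high level, so the following is a minimal NumPy sketch of the general idea rather than the authors' formulation. It assumes a standard supervised contrastive term over L2-normalized embeddings, class prototypes taken as mean embeddings, and "hard" background samples defined as those with the largest prototype loss. All function names, the `temperature`, `hard_fraction`, and `lam` parameters, and the way the two terms are summed are hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def supervised_contrastive_loss(z, labels, temperature=0.1):
    """Pull embeddings with the same label together, push others apart."""
    z = l2_normalize(z)
    n = z.shape[0]
    sim = z @ z.T / temperature
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    not_self = ~np.eye(n, dtype=bool)
    exp_sim = np.exp(sim) * not_self            # exclude the anchor itself
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]) & not_self
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0                           # anchors with at least one positive
    if not valid.any():
        return 0.0
    per_anchor = -(log_prob * pos).sum(axis=1)[valid] / n_pos[valid]
    return float(per_anchor.mean())


def prototypical_contrastive_loss(z, labels, temperature=0.1, hard_fraction=0.5):
    """Assumed form: classify each sample against class-mean prototypes,
    keeping all foreground samples but only the hardest background ones.
    Labels must be integers (0 = background, 1 = foreground), both present."""
    z = l2_normalize(z)
    protos = l2_normalize(np.stack([z[labels == c].mean(axis=0) for c in (0, 1)]))
    logits = z @ protos.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nll = -log_prob[np.arange(len(labels)), labels]
    fg_idx = np.flatnonzero(labels == 1)
    bg_idx = np.flatnonzero(labels == 0)
    k = max(1, int(np.ceil(hard_fraction * bg_idx.size)))
    hard_bg = bg_idx[np.argsort(nll[bg_idx])[-k:]]  # largest loss = hardest
    keep = np.concatenate([fg_idx, hard_bg])
    return float(nll[keep].mean())


def spcl_loss(z, labels, lam=1.0, temperature=0.1, hard_fraction=0.5):
    """Hypothetical combination: supervised term plus weighted prototype term."""
    return (supervised_contrastive_loss(z, labels, temperature)
            + lam * prototypical_contrastive_loss(z, labels, temperature, hard_fraction))


if __name__ == "__main__":
    # Toy pixel embeddings: 4 foreground (vessel) and 12 background samples.
    rng = np.random.default_rng(0)
    z = np.vstack([rng.normal([1.0, 0.0], 0.1, size=(4, 2)),
                   rng.normal([0.0, 1.0], 0.1, size=(12, 2))])
    labels = np.array([1] * 4 + [0] * 12)
    print("SPCL on separable embeddings:", spcl_loss(z, labels))
```

In a segmentation setting, `z` would hold embeddings sampled from encoder feature maps and `labels` the corresponding mask values. Restricting the background term to the hardest samples is one way to keep the abundant, easy background pixels from dominating the loss.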
