MFA-Net: Multi-Scale feature fusion attention network for liver tumor segmentation

Yanli Yuan, Bingbing Wang, Chuan Zhang, Jingyi Xu, Ximeng Liu, Liehuang Zhu

TL;DR

This work tackles liver tumor segmentation from CT images by addressing multi-scale feature fusion in F-CNNs. It proposes MFA-Net, an attention-based framework that places SCSE blocks within a U-Net backbone to recalibrate features spatially and channel-wise at the same time. MFA-Net uses four spatial and four channel attention blocks, implemented via SSCE and CSSE submodules, applied in parallel to guide feature fusion across scales. Experiments on 3D-IRCADb-01 and LiTS 2017 show that MFA-Net generally improves segmentation metrics and visual quality, though AttUNet can outperform it on LiTS 2017 in some cases. The findings underscore the effectiveness of spatial excitation and parallel SE-based attention in medical image segmentation and suggest that SE variants may be worth applying more broadly.

Abstract

Segmentation of organs of interest in medical CT images is beneficial for diagnosis of diseases. Though recent methods based on Fully Convolutional Neural Networks (F-CNNs) have shown success in many segmentation tasks, fusing features from images with different scales is still a challenge: (1) Due to the lack of spatial awareness, F-CNNs share the same weights at different spatial locations. (2) F-CNNs can only obtain surrounding information through local receptive fields. To address the above challenge, we propose a new segmentation framework based on attention mechanisms, named MFA-Net (Multi-Scale Feature Fusion Attention Network). The proposed framework can learn more meaningful feature maps among multiple scales and result in more accurate automatic segmentation. We compare our proposed MFA-Net with SOTA methods on two 2D liver CT datasets. The experimental results show that our MFA-Net produces more precise segmentation on images with different scales.

Paper Structure

This paper contains 16 sections, 2 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Our proposed model MFA-Net. We adopt four channel attention blocks (${CA_{1-4}}$) and four spatial attention blocks (${SA_{1-4}}$) in the U-Net architecture. All blocks are located after every two successive convolution layers of the encoder, and the spatial attention block and the channel attention block are used in parallel each time.
  • Figure 2: Network architecture of SCSE with squeeze & excitation (SE) blocks.
  • Figure 3: Visual comparison between MFA-Net and other networks based on U-Net for liver tumor segmentation. Red arrows highlight some mis-segmentation. Our MFA-Net achieves better results.
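
The parallel use of a channel attention block and a spatial attention block described for Figure 1 can be sketched in a few lines of NumPy. This is a minimal illustration of a generic concurrent squeeze-and-excitation step, not the authors' implementation: the function names, the reduction ratio `r`, the random weights, and the choice to fuse the two branches by element-wise addition are all assumptions made for the example, since the summary above does not state how the two outputs are combined.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """Squeeze spatially, then reweight channels. x has shape (C, H, W)."""
    z = x.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(w1 @ z, 0.0)   # bottleneck layer with ReLU -> (C // r,)
    gate = sigmoid(w2 @ hidden)        # one gate in (0, 1) per channel -> (C,)
    return x * gate[:, None, None]


def spatial_attention(x, w_sp):
    """Squeeze channels with a 1x1 convolution, then reweight each pixel."""
    q = np.tensordot(w_sp, x, axes=([0], [0]))  # weighted channel sum -> (H, W)
    gate = sigmoid(q)                           # one gate in (0, 1) per location
    return x * gate[None, :, :]


def scse_block(x, w1, w2, w_sp):
    """Run both branches in parallel on the same input.

    Fusing by addition is an assumption of this sketch.
    """
    return channel_attention(x, w1, w2) + spatial_attention(x, w_sp)


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
w_sp = rng.standard_normal(C) * 0.1

y = scse_block(x, w1, w2, w_sp)
print(y.shape)
```

The output keeps the input's shape, so in a U-Net such a block could sit after a pair of encoder convolutions without changing the tensor sizes that the rest of the network expects. In a trained network the weights would be learned; here they are random placeholders.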