HES-UNet: A U-Net for Hepatic Echinococcosis Lesion Segmentation

Jiayan Chen, Kai Li, Zhanjin Wang, Zhan Wang, Jianqiang Huang

TL;DR

Segmenting hepatic echinococcosis (HE) lesions in CT images is difficult because the lesions have irregular shapes and ambiguous boundaries, and the disease is most common in resource-limited regions. HES-UNet integrates a multi-scale feature integration encoder (MFSI), a multi-scale global feature filtering module (MGF), a progressive fusion decoder (PF), and a deep supervision (DS) mechanism to jointly capture local and global information. The architecture replaces max-pooling with a multi-directional downsampling block (MDB), aggregates multi-scale features via a multi-scale aggregation block (MAB), and fuses features with global attention modules (GAMs) and multi-scale upsampling blocks (MUBs). On the authors' dataset it achieves a Dice Similarity Coefficient (DSC) of 89.21%, outperforming several state-of-the-art models. Ablation studies confirm that MDB, MAB, and MUB each contribute to the improved segmentation performance, underscoring the effectiveness of multi-scale and global-feature fusion for HE lesion delineation in CT images.

Abstract

Hepatic echinococcosis (HE) is a prevalent disease in economically underdeveloped pastoral areas, where adequate medical resources are usually lacking. Existing methods often ignore multi-scale feature fusion or focus only on feature fusion between adjacent levels, which may lead to insufficient feature fusion. To address these issues, we propose HES-UNet, an efficient and accurate model for HE lesion segmentation. This model combines convolutional layers and attention modules to capture local and global features. During downsampling, the multi-directional downsampling block (MDB) is employed to integrate high-frequency and low-frequency features, effectively extracting image details. The multi-scale aggregation block (MAB) aggregates multi-scale feature information. The multi-scale upsampling block (MUB), in turn, learns highly abstract features and supplies this information to the skip connection module to fuse multi-scale features. Due to the distinct regional characteristics of HE, there is currently no publicly available high-quality dataset for training our model. We collected CT slice data from 268 patients at a certain hospital to train and evaluate the model. The experimental results show that HES-UNet achieves state-of-the-art performance on our dataset, achieving an overall Dice Similarity Coefficient (DSC) of 89.21%, which is 1.09% higher than that of TransUNet. The project page is available at https://chenjiayan-qhu.github.io/HES-UNet-page.
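
For readers unfamiliar with the reported metric, the Dice Similarity Coefficient between a predicted mask P and a ground-truth mask T is commonly defined as DSC = 2|P ∩ T| / (|P| + |T|). The following is a minimal sketch of that standard definition for binary masks, not the authors' evaluation code; the function name `dice_coefficient`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice Similarity Coefficient for two binary masks.

    Illustrative helper (hypothetical name, not from the paper).
    Masks are nested lists of 0/1 values with identical shape.
    """
    # Flatten both masks and coerce every pixel to 0 or 1.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # |P ∩ T|: pixels marked as lesion in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    # |P| + |T|: total lesion pixels across the two masks.
    total = sum(flat_pred) + sum(flat_target)

    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [[0, 1, 1],
                 [0, 1, 0]]
    reference = [[0, 1, 0],
                 [0, 1, 1]]
    # Two overlapping pixels, three lesion pixels in each mask.
    print(round(dice_coefficient(predicted, reference), 4))
```

In this toy case the masks share two lesion pixels and each contains three, so the score is 2·2 / (3 + 3) ≈ 0.667. A dataset-level figure such as the 89.21% above would normally be an aggregate over many slices or patients, and the exact aggregation is not specified in this section.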

Paper Structure

This paper contains 16 sections, 3 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: The overall pipeline of HES-UNet. HES-UNet consists of four main components: MFSI encoder, MGF module, PF decoder, and DS module.
  • Figure 2: The structure of each module in HES-UNet: (A) Encoder Block, (B) Decoder Block, (C) Multi-scale Aggregation Block, (D) Multi-scale Upsampling Block, (E) Multi-directional Downsampling Block, (F) Global Attention Module, and (G) Pixel Shuffle.
  • Figure 3: Our segmentation results compared with other models.