TSUBF-Net: Trans-Spatial UNet-like Network with Bi-direction Fusion for Segmentation of Adenoid Hypertrophy in CT

Rulin Zhou, Yingjie Feng, Guankun Wang, Xiaopin Zhong, Zongze Wu, Qiang Wu, Xi Zhang

TL;DR

This paper addresses the challenge of segmenting adenoid hypertrophy in CT, where boundaries are often indistinct. It introduces TSUBF-Net, a 3D UNet-like framework augmented with the Trans-Spatial Perception (TSP) module and the Bi-directional Sampling Collaborated Fusion (BSCF) module, plus a Sobel gradient loss to enforce boundary smoothness. The approach achieves state-of-the-art performance on the Adenoid Hypertrophy Segmentation Dataset (AHSD) and competitive results on public datasets such as ACDC and MSD-Lung, demonstrating improved boundary handling and segmentation accuracy. The proposed methods hold promise for computer-assisted preoperative planning and generalize to other 3D organ segmentation tasks; future work aims to extend them to additional airway structures.

Abstract

Adenoid hypertrophy is a common cause of obstructive sleep apnea-hypopnea syndrome in children. It is characterized by snoring, nasal congestion, and growth disorders. Computed Tomography (CT) is a pivotal medical imaging modality that uses X-rays and advanced computational techniques to generate detailed cross-sectional images. In pediatric airway assessment, CT imaging gives a clear view of the shape and volume of enlarged adenoids. Despite the advances of deep learning methods for medical image analysis, a gap remains in the segmentation of adenoid hypertrophy in CT scans. To address this research gap, we introduce TSUBF-Net (Trans-Spatial UNet-like Network based on Bi-direction Fusion), a 3D medical image segmentation framework. TSUBF-Net is engineered to discern intricate 3D spatial interlayer features in CT scans and to enhance feature extraction where boundaries are blurred. Notably, we propose two innovative modules within the U-shaped network architecture: the Trans-Spatial Perception module (TSP) and the Bi-directional Sampling Collaborated Fusion module (BSCF). The TSP module operates during the sampling process, and the BSCF module strategically fuses down-sampled and up-sampled features. Furthermore, we introduce the Sobel loss term, which optimizes the smoothness of the segmentation results and enhances model accuracy. Extensive 3D segmentation experiments are conducted on several datasets. On our own AHSD dataset, TSUBF-Net outperforms state-of-the-art methods, with the lowest HD95 (7.03), an IoU of 85.63, and a DSC of 92.26. The results on two other public datasets also demonstrate that our methods can robustly and effectively address the challenges of 3D segmentation in CT scans.
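The overlap metrics reported in the abstract (DSC and IoU) can be illustrated with a minimal sketch. The function name `dice_and_iou`, the flat-list mask representation, and the convention for two empty masks are assumptions made here for illustration; the paper's own evaluation code is not shown in this summary, and HD95 is not covered.

```python
def dice_and_iou(pred, target):
    """Return (DSC, IoU) for two binary masks given as flat sequences of 0/1.

    Hypothetical helper for illustration; a 3D volume would be flattened first.
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")

    # Count voxels that are foreground in both masks, and in each mask alone.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_count = sum(1 for p in pred if p)
    target_count = sum(1 for t in target if t)
    union = pred_count + target_count - intersection

    if union == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0, 1.0

    dsc = 2.0 * intersection / (pred_count + target_count)
    iou = intersection / union
    return dsc, iou


# Toy example: 3 predicted voxels, 3 target voxels, 2 shared.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dsc, iou = dice_and_iou(pred, target)
print(f"DSC = {dsc:.4f}, IoU = {iou:.4f}")
```

For a single pair of masks the two metrics are tied by DSC = 2·IoU / (1 + IoU). The reported AHSD values are numerically consistent with this relation (2 × 0.8563 / 1.8563 ≈ 0.9226), although metrics averaged over many scans need not satisfy it exactly.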

Paper Structure

This paper contains 21 sections, 11 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: Schematic representation of tissue structures in CT of the head.
  • Figure 2: Common segmentation task vs. adenoid hypertrophy segmentation task: Unlike common segmentation tasks, the boundaries of the segmented objects in the adenoid hypertrophy segmentation task are vague and unclear.
  • Figure 3: Overview of the proposed TSUBF-Net framework. TSUBF-Net adopts a U-shaped structure. Both the up-sampling and down-sampling paths combine convolutional feature extraction with the TSP module and the BSCF module. In the down-sampling path, the original image first passes through the patch embedding layer, where convolution extracts features from the original data and reduces the feature size to $H/4 \times W/4 \times D/4 \times C_1$. Similarly, across the four down-sampling stages, the 3D spatial size of the features decreases exponentially while the number of channels increases.
  • Figure 4: Detailed framework of the TSP module. The proposed TSP module is divided into a channel attention module and an inter-layer attention module. The channel attention applies a single attention head along the channel dimension. The inter-layer attention applies attention in three directions, making it a three-head attention in which the Q and K matrices are shared. The output then passes through a series of convolutional layers.
  • Figure 5: Detailed framework of the BSCF module. The proposed BSCF module first processes the up-sampled and down-sampled features with the same 3×3×3 convolution kernel, emphasizing the consistency between the two, and then applies a 1×1×1 convolution for further feature extraction. Finally, an attention mechanism corrects the up-sampled feature information based on the up-sampled segmentation information. To emphasize spatial features, TSP is used as the attention mechanism here.
  • ...and 3 more figures
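
The feature-size progression described in the Figure 3 caption can be sketched as follows. Only the patch embedding output ($H/4 \times W/4 \times D/4 \times C_1$) and the count of four down-sampling stages come from the caption. The stride of 2 per later stage, the channel doubling, the choice to count patch embedding as the first of the four stages, and the example input size and $C_1$ are assumptions for illustration, and `encoder_feature_shapes` is a hypothetical helper, not part of the paper's code.

```python
def encoder_feature_shapes(h, w, d, c1, stages=4,
                           patch_stride=4, stage_stride=2, channel_growth=2):
    """List (H, W, D, C) feature shapes along a down-sampling path.

    patch_stride=4 follows the Figure 3 caption (H/4 x W/4 x D/4 x C_1).
    stage_stride and channel_growth are ASSUMED values, not from the source.
    """
    # Stage 1: patch embedding reduces each spatial axis by patch_stride.
    current = (h // patch_stride, w // patch_stride, d // patch_stride, c1)
    shapes = [current]

    # Remaining stages: shrink the spatial axes, widen the channels.
    for _ in range(stages - 1):
        ch, cw, cd, cc = current
        current = (ch // stage_stride, cw // stage_stride,
                   cd // stage_stride, cc * channel_growth)
        shapes.append(current)
    return shapes


# Hypothetical input volume of 128 x 128 x 128 with C_1 = 32.
shapes = encoder_feature_shapes(128, 128, 128, 32)
for stage, shape in enumerate(shapes, start=1):
    print(f"stage {stage}: {shape}")
```

Under these assumptions the spatial size falls geometrically from 32 to 4 per axis while the channel count grows from 32 to 256, which matches the caption's description of exponentially shrinking 3D features with increasing channels.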