DEFN: Dual-Encoder Fourier Group Harmonics Network for Three-Dimensional Indistinct-Boundary Object Segmentation

Xiaohua Jiang, Yihao Guo, Jian Huang, Yuting Wu, Meiyi Luo, Zhaoyang Xu, Qianni Zhang, Xingru Huang, Hong He, Shaowei Jiang, Jing Ye, Mang Xiao

TL;DR

DEFN tackles indistinct-boundary segmentation in 3D medical imaging by combining frequency-domain feature extraction with a dual-encoder backbone and a dynamic loss fusion strategy. Key components include FuGH for frequency-domain processing, S3DSA for 3D spatial attention, and HSE for channel recalibration, along with SDi data augmentation and the adaptive $L_{DWC}$ loss, defined by $L_{Total} = \sum_{i=1}^{4} \lambda_i L_i$, and the inclusion of $L_{DeepRanking}$. Evaluated on the OIMHS dataset with augmentation from CARS-30k, the approach achieves state-of-the-art segmentation of macular hole and macular edema and enables real-time 3D fundus reconstruction with ETDRS-based quantitative indices. The results indicate that translating data to the frequency domain and dynamically balancing loss components improve robustness to noise and boundary ambiguity, with broad potential for other indistinct-boundary medical structures and 3D reconstructions.
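As a minimal sketch of the weighted sum $L_{Total} = \sum_{i=1}^{4} \lambda_i L_i$, the snippet below combines four scalar loss values with four weights. The summary does not say how DWC updates the weights $\lambda_i$ during training, so the helper `interpolate_weights` and its linear ramp are purely hypothetical placeholders for "weights that depend on training progress", not the authors' rule. The function names and sample numbers are likewise assumptions made for illustration.

```python
from typing import List, Sequence


def compose_total_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum  sum_i weights[i] * losses[i]."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    return sum(w * l for w, l in zip(weights, losses))


def interpolate_weights(
    start: Sequence[float], end: Sequence[float], progress: float
) -> List[float]:
    """HYPOTHETICAL schedule: linearly blend two weight vectors.

    `progress` is the fraction of training completed, clamped to [0, 1].
    This is an illustrative stand-in, not the DWC update rule.
    """
    if len(start) != len(end):
        raise ValueError("start and end must have the same length")
    p = min(max(progress, 0.0), 1.0)
    return [(1.0 - p) * s + p * e for s, e in zip(start, end)]


if __name__ == "__main__":
    # Four made-up loss values standing in for L_1 .. L_4.
    losses = [1.0, 2.0, 3.0, 4.0]
    # Made-up weight vectors for the beginning and end of training.
    early = [0.4, 0.3, 0.2, 0.1]
    late = [0.1, 0.2, 0.3, 0.4]
    for progress in (0.0, 0.5, 1.0):
        lambdas = interpolate_weights(early, late, progress)
        print(progress, compose_total_loss(losses, lambdas))
```

Keeping the weight schedule separate from the summation makes it straightforward to swap in whatever rule the full paper specifies without touching the loss composition itself.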

Abstract

The precise spatial and quantitative delineation of indistinct-boundary medical objects is paramount for the accuracy of diagnostic protocols, the efficacy of surgical interventions, and the reliability of postoperative assessments. Despite this significance, effective segmentation and instantaneous three-dimensional reconstruction of such objects are significantly impeded by noise artifacts and the paucity of representative samples in available datasets. To surmount these challenges, we introduce Stochastic Defect Injection (SDi) to augment the representational diversity of challenging indistinct-boundary objects within training corpora. We then propose the Dual-Encoder Fourier Group Harmonics Network (DEFN) to tailor noise filtration, amplify detailed feature recognition, and bolster representation across diverse medical imaging scenarios. By incorporating the Dynamic Weight Composing (DWC) loss, which dynamically adjusts the model's focus based on training progression, DEFN achieves SOTA performance on the OIMHS public dataset, showcasing its effectiveness in indistinct-boundary contexts. Source code for DEFN is available at: https://github.com/IMOP-lab/DEFN-pytorch.

Paper Structure (31 sections, 22 equations, 16 figures, 7 tables)

This paper contains 31 sections, 22 equations, 16 figures, 7 tables.

Figures (16)

  • Figure 1: Indistinct-boundary challenges in medical imaging segmentation. (a, b) Hypopharyngeal cancer presents difficulties in boundary delineation due to submucosal growth. (c, d) Macular holes pose segmentation challenges with their inferred upper boundaries. (e) Calcific shadows in coronary artery IVUS imaging obscure arterial boundaries. (f) Side-branches further complicate segmentation.
  • Figure 2: Schematic representation of the project workflow encompassing a data augmentation pipeline utilizing SDi, a data amplification technology, the pre-training and fine-tuning phases for the DEFN, along with the segmentation pipeline. The SDi includes two strategies: isolated and comprehensive injection. The DWC, a dynamic weight composing network optimization strategy, is integrated into this workflow to enhance the model performance.
  • Figure 3: Results of employing SDi: (a) original image, (b) image processed with the isolated injection strategy, (c) image processed with the comprehensive injection strategy, (d) corresponding mask for (a), (e) corresponding mask for (b), and (f) corresponding mask for (c). Different colors in masks represent different eye conditions: the green area represents the retina, the red area represents the macular hole, and the blue area represents the macular edema.
  • Figure 4: Schematic representation of the DEFN architecture, encompassing a main branch, an HSE branch, and a decoding branch. Within the HSE branch, five consecutive HSE modules are interlaced with Norm operations. The main branch is structured with a Conv layer, followed by four consecutive FuGH modules interlaced with MaxPooling operations. The decoding branch comprises five upsampling layers. Integral to this architecture is a Simplified 3D Spatial Attention mechanism, honing the model's focus on relevant spatial features.
  • Figure 5: Schematic representation of the FuGH module, initiating with an FFT operation that segregates the input into real and imaginary components. These components are then sequentially channeled through a Conv layer, a GELU activation function, and a subsequent Conv layer on both upper and lower pathways. Following these processing stages, an IFFT operation is executed to yield the final output.
  • ...and 11 more figures
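
The data flow described in the Figure 5 caption (FFT, split into real and imaginary components, Conv → GELU → Conv on each pathway, then IFFT) can be sketched in NumPy as follows. This is a rough illustration under several assumptions that the caption does not confirm: the convolutions are modeled as bias-free 1×1×1 (pointwise) channel mixes, the FFT is taken over the three spatial axes, the real part of the IFFT output is returned, and any grouping implied by "Group Harmonics" is omitted. The names `fugh_sketch`, `pointwise_conv`, and `gelu` are hypothetical and are not taken from the DEFN repository.

```python
import numpy as np


def gelu(x: np.ndarray) -> np.ndarray:
    """Tanh approximation of the GELU activation."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def pointwise_conv(x: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Bias-free 1x1x1 convolution: mix channels at every voxel.

    x has shape (C_in, D, H, W); weight has shape (C_out, C_in).
    """
    return np.einsum("oc,cdhw->odhw", weight, x)


def fugh_sketch(x, w1_re, w2_re, w1_im, w2_im):
    """Illustrative FFT -> (Conv, GELU, Conv) per component -> IFFT pipeline."""
    # Forward FFT over the three spatial axes (assumed; channels are untouched).
    spec = np.fft.fftn(x, axes=(1, 2, 3))
    # Upper pathway processes the real component.
    re = pointwise_conv(gelu(pointwise_conv(spec.real, w1_re)), w2_re)
    # Lower pathway processes the imaginary component.
    im = pointwise_conv(gelu(pointwise_conv(spec.imag, w1_im)), w2_im)
    # Recombine into a complex spectrum and return to the spatial domain.
    out = np.fft.ifftn(re + 1j * im, axes=(1, 2, 3))
    # Keeping only the real part is an assumption of this sketch.
    return out.real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    volume = rng.standard_normal((2, 4, 4, 4))  # (channels, D, H, W), toy size
    weights = [rng.standard_normal((2, 2)) for _ in range(4)]
    print(fugh_sketch(volume, *weights).shape)
```

In a trainable setting the four weight matrices would be learned parameters; here they are random arrays used only to exercise the data flow.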