Table of Contents
Fetching ...

Multiscale Encoder and Omni-Dimensional Dynamic Convolution Enrichment in nnU-Net for Brain Tumor Segmentation

Sahaj K. Mistry, Sourav Saini, Aashray Gupta, Aayush Gupta, Sunny Rai, Vinit Jakhetiya, Ujjwal Baid, Sharath Chandra Guntuku

TL;DR

This study introduces a novel segmentation algorithm utilizing a modified nnU-Net architecture that enhances conventional convolution layers by incorporating omni-dimensional dynamic convolution layers, resulting in improved feature representation and proposes a multi-scale attention strategy that harnesses contemporary insights from various scales.

Abstract

Brain tumor segmentation plays a crucial role in computer-aided diagnosis. This study introduces a novel segmentation algorithm utilizing a modified nnU-Net architecture. Within the nnU-Net architecture's encoder section, we enhance conventional convolution layers by incorporating omni-dimensional dynamic convolution layers, resulting in improved feature representation. Simultaneously, we propose a multi-scale attention strategy that harnesses contemporary insights from various scales. Our model's efficacy is demonstrated on diverse datasets from the BraTS-2023 challenge. Integrating omni-dimensional dynamic convolution (ODConv) layers and multi-scale features yields substantial improvement in the nnU-Net architecture's performance across multiple tumor segmentation datasets. Remarkably, our proposed model attains good accuracy during validation for the BraTS Africa dataset. The ODconv source code along with full training code is available on GitHub.

Multiscale Encoder and Omni-Dimensional Dynamic Convolution Enrichment in nnU-Net for Brain Tumor Segmentation

TL;DR

This study introduces a novel segmentation algorithm utilizing a modified nnU-Net architecture that enhances conventional convolution layers by incorporating omni-dimensional dynamic convolution layers, resulting in improved feature representation and proposes a multi-scale attention strategy that harnesses contemporary insights from various scales.

Abstract

Brain tumor segmentation plays a crucial role in computer-aided diagnosis. This study introduces a novel segmentation algorithm utilizing a modified nnU-Net architecture. Within the nnU-Net architecture's encoder section, we enhance conventional convolution layers by incorporating omni-dimensional dynamic convolution layers, resulting in improved feature representation. Simultaneously, we propose a multi-scale attention strategy that harnesses contemporary insights from various scales. Our model's efficacy is demonstrated on diverse datasets from the BraTS-2023 challenge. Integrating omni-dimensional dynamic convolution (ODConv) layers and multi-scale features yields substantial improvement in the nnU-Net architecture's performance across multiple tumor segmentation datasets. Remarkably, our proposed model attains good accuracy during validation for the BraTS Africa dataset. The ODconv source code along with full training code is available on GitHub.
Paper Structure (35 sections, 2 figures, 7 tables)

This paper contains 35 sections, 2 figures, 7 tables.

Figures (2)

  • Figure 1: Tumor Sub-regions Annotations Depicted various annotated tumor sub-regions across different multi-parametric MRI scans. Image panels A-C illustrate the areas designated for assessing algorithm performance, highlighting the enhancing tumor (ET - indicated in yellow) as seen in a T1C scan. This region encompasses the cystic/necrotic components of the core in panel A. In panels B and C, the tumor core (TC - magenta) and the entire tumor (WT - cyan) are delineated, respectively, in corresponding T2W and T2F scans. Panel D provides an overview of the merged segmentations forming the comprehensive tumor sub-region labels given to participants in the BraTS 2021 challenge. These labels comprise the enhancing core (yellow), necrotic/cystic core (red), and edema/invasion (green). The image is obtained from baid2021rsna.
  • Figure 2: nnU-Net with the addition of Multi-Scale and ODConv3D Layers: The architecture comprises a pair of identical encoders, responsible for extracting features from the input at distinct scales. These feature vectors undergo cross-attention processing, and the output is subsequently channeled into a decoder. The decoder's role involves upscaling the features to their original dimensions, effectively generating segmentations.