Brain Tumor Segmentation Based on Deep Learning, Attention Mechanisms, and Energy-Based Uncertainty Prediction

Zachary Schwehr, Sriman Achanta

TL;DR

This paper proposes a region of interest detection algorithm, applied during data preprocessing, that locates salient features and removes extraneous MRI data, allowing for more aggressive data augmentations and deeper neural networks.
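
As a rough illustration of how cropping to a region of interest shrinks the input, the sketch below crops a multi-modal volume to the bounding box of a binary mask. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function name `crop_to_roi`, the `margin` parameter, and the idea that the region of interest arrives as a binary mask are all hypothetical choices made for this example.

```python
import numpy as np

def crop_to_roi(volume, mask, margin=2):
    """Crop a (D, H, W, C) volume to the bounding box of a binary (D, H, W) mask.

    Hypothetical helper: `margin` voxels of context are kept on every side,
    clipped to the volume bounds.
    """
    coords = np.argwhere(mask)              # (N, 3) indices of nonzero voxels
    if coords.size == 0:                    # empty mask: keep the full volume
        return volume, tuple(slice(0, s) for s in mask.shape)
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    box = tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))
    return volume[box], box                 # channel axis is left untouched

# Toy example: four modalities, one small "salient" block.
volume = np.zeros((16, 16, 16, 4), dtype=np.float32)
mask = np.zeros((16, 16, 16), dtype=bool)
mask[4:8, 5:9, 6:12] = True
cropped, box = crop_to_roi(volume, mask, margin=2)
```

Returning the slice tuple alongside the crop makes it possible to paste a prediction on the cropped volume back into the original coordinate frame.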

Abstract

Brain tumors are among the deadliest forms of cancer, with a mortality rate of over 80%. A quick and accurate diagnosis is crucial to increase the chance of survival. However, in medical analysis, the manual annotation and segmentation of a brain tumor can be a complicated task. Multiple MRI modalities are typically analyzed because each provides unique information regarding the tumor regions. Although these MRI modalities are helpful for segmenting gliomas, they tend to increase overfitting and computational cost. This paper proposes a region of interest detection algorithm that is applied during data preprocessing to locate salient features and remove extraneous MRI data. This decreases the input size, allowing for more aggressive data augmentations and deeper neural networks. Following the preprocessing of the MRI modalities, a fully convolutional autoencoder with soft attention segments the different brain MRIs. When such deep learning algorithms are deployed in practice, analysts and physicians cannot readily distinguish accurate predictions from inaccurate ones. To address this, test-time augmentations and an energy-based model are used for voxel-wise uncertainty prediction. Experiments on the BraTS benchmarks achieved state-of-the-art segmentation performance. Additionally, qualitative results were used to assess the segmentation models and uncertainty predictions.
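
The abstract does not give the formulas behind the uncertainty predictions, so the following is only a sketch of one common way to combine the two ideas. It assumes flip-based test-time augmentation and the free-energy score $E = -T \log \sum_k \exp(z_k / T)$ computed per voxel from class logits $z$. The `predict` callable, the `temperature` parameter, and the choice of flips are hypothetical and stand in for whatever the trained model and augmentation set actually are.

```python
import numpy as np

def free_energy(logits, temperature=1.0):
    """Per-voxel free energy E = -T * logsumexp(logits / T) over the class axis."""
    z = logits / temperature
    m = z.max(axis=-1, keepdims=True)       # subtract the max for numerical stability
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * lse

def tta_logits(predict, volume, flip_axes=(0, 1, 2)):
    """Average logits over the identity and single-axis flips.

    `predict` is a hypothetical callable mapping a (D, H, W, C) volume to
    (D, H, W, K) class logits. Each flipped prediction is flipped back before
    averaging. Returns the mean logits and a per-voxel variance across views.
    """
    outputs = [predict(volume)]
    for ax in flip_axes:
        flipped = np.flip(volume, axis=ax)
        outputs.append(np.flip(predict(flipped), axis=ax))
    stacked = np.stack(outputs)             # (views, D, H, W, K)
    return stacked.mean(axis=0), stacked.var(axis=0).mean(axis=-1)

# Toy stand-in for a segmentation model: a per-voxel linear map to 4 classes.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 4))
predict = lambda v: v @ weights
volume = rng.normal(size=(8, 8, 8, 4))

mean_logits, tta_variance = tta_logits(predict, volume)
energy = free_energy(mean_logits)           # lower energy = more confident voxel
```

Because the toy model acts on each voxel independently, its predictions are unchanged by flips and the variance across views is essentially zero; a real convolutional network would generally disagree across views, and those disagreements are what the variance map highlights.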

Paper Structure

This paper contains 17 sections, 14 equations, 6 figures, 5 tables, and 1 algorithm.

Figures (6)

  • Figure 1: The architecture of the binary brain tumor segmentation model. The model is a U-Net-like structure with instance normalization, strided convolutions, and the ELU activation function. The strided convolutions and transposed convolutional layers use a scale factor of 2, downsampling and upsampling the feature maps by a factor of 2, respectively. The input dimensions are $T \times 128 \times 128 \times 128 \times 4$, where $T$ is the batch size and $128 \times 128 \times 128$ is the size of the 3D brain MRI. The four channels represent the MRI modalities T1, T1-Gd, T2, and FLAIR. The output is $T \times 128 \times 128 \times 128$, representing the binary tumor segmentations for the $T$ samples in the batch.
  • Figure 2: ConvBlock1 employs two convolutional layers with a kernel size of $3 \times 3 \times 3$, instance normalization, and ELU.
  • Figure 3: The architecture of the multiclass segmentation model. It is similar to the binary segmentation model; however, it employs soft attention mechanisms, more filter channels, fewer encoding and decoding blocks, and ReLU instead of ELU. The input dimensions are $T \times 48 \times 48 \times 128 \times 4$. The output is $T \times 48 \times 48 \times 128 \times 4$, representing the segmented brain tumors for the $T$ samples in the batch, with the four channels representing the four classes: normal brain tissue, peritumoral edema, enhancing tumor region, and non-enhancing and necrotic tumor region.
  • Figure 4: ConvBlock2 employs two convolutional layers with a kernel size of $3 \times 3 \times 3$, instance normalization, ReLU, and a channel-based attention algorithm. The channel-based attention algorithm computes the attention coefficient, $\alpha$, which weights the importance of each channel; GAP denotes global average pooling, which takes the global average of each channel.
  • Figure 5: The Attention Gate (A) is from the Attention U-Net and computes the attention coefficient, $\alpha$, which weights the importance of the different spatial regions in the feature map.
  • ...and 1 more figure
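
The channel-based attention described in the Figure 4 caption can be sketched roughly as follows. The caption only states that global average pooling (GAP) feeds the computation of the coefficient $\alpha$; the two-layer bottleneck with ReLU and a sigmoid used here is an assumption borrowed from squeeze-and-excitation-style blocks, and the names `channel_attention`, `w1`, and `w2` are hypothetical.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Reweight the channels of a (D, H, W, C) feature map.

    Assumed layout (not confirmed by the caption): GAP -> dense + ReLU
    -> dense + sigmoid, with w1 of shape (C, C_r) and w2 of shape (C_r, C).
    """
    gap = features.mean(axis=(0, 1, 2))               # global average per channel, (C,)
    hidden = np.maximum(gap @ w1, 0.0)                # bottleneck with ReLU, (C_r,)
    alpha = 1.0 / (1.0 + np.exp(-(hidden @ w2)))      # attention coefficients in (0, 1)
    return features * alpha, alpha                    # alpha broadcasts over D, H, W

# Toy example with 8 channels and a bottleneck of 2.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 4, 4, 8))
w1 = rng.normal(size=(8, 2))
w2 = rng.normal(size=(2, 8))
reweighted, alpha = channel_attention(features, w1, w2)
```

Each channel is scaled by a single scalar, so this mechanism decides which feature channels matter, whereas the attention gate in Figure 5 assigns a coefficient to each spatial location instead.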