Enhancing Hierarchical Transformers for Whole Brain Segmentation with Intracranial Measurements Integration

Xin Yu; Yucheng Tang; Qi Yang; Ho Hin Lee; Shunxing Bao; Yuankai Huo; Bennett A. Landman

TL;DR

The paper tackles comprehensive whole-brain segmentation together with the estimation of two intracranial measurements, total intracranial volume (TICV) and posterior fossa volume (PFV). It extends the hierarchical transformer UNesT with two additional convolutional heads so that a single model jointly predicts 133 whole-brain classes and TICV/PFV. Training has two stages: pretraining with pseudo-labels on 4859 T1w volumes from multiple sites, then finetuning on 45 OASIS volumes that carry both whole-brain and TICV/PFV labels. The reported DSCs are approximately $0.96$ for TICV and $0.95$ for PFV, while the DSC over the 132 brain regions stays near the baseline ($\sim 0.75$), a manageable trade-off controlled by scheduling the loss weights in $L = L_{brain} + \beta_1 L_{TICV} + \beta_2 L_{PFV}$. A containerized workflow is provided for practical deployment in downstream neuroimaging analyses.
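The weighted objective can be sketched in a few lines of Python. Only the form of the loss and the 20K-iteration switch point (mentioned in the Figure 4 caption) come from the paper; the single-step schedule, the weight values, and the function names are illustrative assumptions.

```python
def head_weights(iteration, switch_at=20_000, early=1.0, late=0.1):
    """Return (beta_1, beta_2) for the TICV and PFV loss terms.

    Assumption: one step change at `switch_at`. The values of `early`
    and `late` are placeholders, not the paper's settings.
    """
    beta = early if iteration < switch_at else late
    return beta, beta


def total_loss(l_brain, l_ticv, l_pfv, iteration):
    """L = L_brain + beta_1 * L_TICV + beta_2 * L_PFV."""
    beta_1, beta_2 = head_weights(iteration)
    return l_brain + beta_1 * l_ticv + beta_2 * l_pfv
```

Lowering the two weights shifts the gradient budget toward the whole-brain term, which is the trade-off the summary describes.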

Abstract

Whole brain segmentation with magnetic resonance imaging (MRI) enables the non-invasive measurement of brain regions, including total intracranial volume (TICV) and posterior fossa volume (PFV). Enhancing the existing whole brain segmentation methodology to incorporate intracranial measurements offers a heightened level of comprehensiveness in the analysis of brain structures. Despite its potential, the task of generalizing deep learning techniques for intracranial measurements faces data availability constraints due to limited manually annotated atlases encompassing whole brain and TICV/PFV labels. In this paper, we enhance the hierarchical transformer UNesT for whole brain segmentation so that it segments the whole brain into 133 classes and TICV/PFV simultaneously. To address the problem of data scarcity, the model is first pretrained on 4859 T1-weighted (T1w) 3D volumes sourced from 8 different sites. These volumes are processed through a multi-atlas segmentation pipeline for label generation, although TICV/PFV labels are unavailable for them. Subsequently, the model is finetuned with 45 T1w 3D volumes from the Open Access Series of Imaging Studies (OASIS), where both the 133 whole brain classes and TICV/PFV labels are available. We evaluate our method with Dice similarity coefficients (DSC). We show that our model is able to conduct precise TICV/PFV estimation while maintaining the performance on the 132 brain regions at a comparable level. Code and trained model are available at: https://github.com/MASILab/UNesT/tree/main/wholebrainSeg.
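For reference, the Dice similarity coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it on flat Python sequences; the function names, the flat-list representation, and the handling of empty masks are assumptions made for illustration, not the authors' evaluation code.

```python
def dice(pred, target):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    denom = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2.0 * intersection / denom


def mean_dice(pred_labels, target_labels, classes):
    """Average the per-class DSC over `classes` for flat integer label maps."""
    scores = [
        dice([p == c for p in pred_labels], [t == c for t in target_labels])
        for c in classes
    ]
    return sum(scores) / len(scores)
```

A multi-region score such as the one reported for the 132 brain regions would correspond to `mean_dice` over the region labels, while TICV and PFV each reduce to a single call to `dice`.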

Paper Structure (10 sections, 1 equation, 5 figures, 2 tables)

Introduction
Methods
Data
UNesT
TICV/PFV Estimation
Experiments and Results
Implementation Details
Experimental and Discussion
Containerized Implementation
Conclusion

Figures (5)

Figure 1: Overview of the model architecture. In each hierarchy, the input image patch sequence is aggregated into blocks, and each block is fed into a transformer block separately. At the end of each hierarchy, the blocks are deblockified back into image patch sequence space for inter-block communication. In the pretraining stage, the final feature map is transformed to have 133 channels. In the finetuning stage, the final feature map is transformed into 3 different feature maps with 133, 1, and 1 channels, respectively, to segment the 132 brain regions, TICV, and PFV.
Figure 2: Visualization of the 132 brain regions segmentation on a randomly selected case in the test set.
Figure 3: Visualization of the TICV/PFV segmentation on a randomly selected case in the test set.
Figure 4: Training curve on the validation set. After the weight of TICV/PFV is reduced at 20K iterations, the performance of TICV drops gradually, while the performance on the 132 brain regions improves markedly.
Figure 5: Overview of the Singularity workflow. The Singularity container accepts both skull-stripped and non-skull-stripped input.
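
The three output heads described in the Figure 1 caption can be illustrated with a minimal, framework-free sketch. It assumes each head is a 1x1x1 convolution, which acts on every voxel independently as a matrix-vector product; the kernel size, the feature width of 16, the random weights, and all names are hypothetical, and only the 133/1/1 channel counts come from the caption.

```python
import random


def make_head(in_channels, out_channels, rng):
    """Weights of a 1x1x1 convolution: one row of input weights per output channel."""
    return [[rng.uniform(-1, 1) for _ in range(in_channels)] for _ in range(out_channels)]


def apply_head(weights, voxel_features):
    """Apply the head to one voxel's feature vector (a matrix-vector product)."""
    return [sum(w * f for w, f in zip(row, voxel_features)) for row in weights]


rng = random.Random(0)
FEATURES = 16  # hypothetical decoder feature width

heads = {
    "brain": make_head(FEATURES, 133, rng),  # 133 channels for the whole-brain classes
    "ticv": make_head(FEATURES, 1, rng),     # single-channel TICV logit
    "pfv": make_head(FEATURES, 1, rng),      # single-channel PFV logit
}

voxel = [rng.uniform(-1, 1) for _ in range(FEATURES)]
outputs = {name: apply_head(w, voxel) for name, w in heads.items()}

# Whole-brain label: argmax over the 133 channels.
brain_label = max(range(133), key=lambda c: outputs["brain"][c])
# Binary heads: a logit above 0 is equivalent to a sigmoid output above 0.5.
in_ticv = outputs["ticv"][0] > 0.0
in_pfv = outputs["pfv"][0] > 0.0
```

One plausible reading of the design is that separate single-channel heads let a voxel carry a brain-region label and also belong to TICV or PFV, which a single softmax over mutually exclusive classes could not express.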
