U-Mamba2: Scaling State Space Models for Dental Anatomy Segmentation in CBCT

Zhi Qin Tan; Xiatian Zhu; Owen Addison; Yunpeng Li

TL;DR

To address the challenge of fast, accurate multi-anatomy CBCT segmentation in dentistry, the authors introduce U-Mamba2, a CNN-SSD hybrid that integrates the Mamba2 state-space framework into a U-Net backbone. The method adds an interactive cross-attention branch, self-supervised pretraining on unlabeled CBCT data, and dental-domain priors such as label smoothing, weighted loss for tiny structures, left-right mirroring, and anatomically informed post-processing. On ToothFairy3, U-Mamba2 sets new state-of-the-art mean Dice scores for both Task 1 and Task 2 and demonstrates strong efficiency, with ablations confirming the value of each domain-knowledge component. Overall, the work delivers a scalable, human-in-the-loop segmentation approach with practical implications for clinical diagnosis and surgical planning in dentistry.

Abstract

Cone-Beam Computed Tomography (CBCT) is a widely used 3D imaging technique in dentistry, providing volumetric information about the anatomical structures of the jaws and teeth. Accurate segmentation of these anatomies is critical for clinical applications such as diagnosis and surgical planning, but remains time-consuming and challenging. In this paper, we present U-Mamba2, a new neural network architecture designed for multi-anatomy CBCT segmentation in the context of the ToothFairy3 challenge. U-Mamba2 integrates Mamba2 state space models into the U-Net architecture, enforcing stronger structural constraints for higher efficiency without compromising performance. In addition, we integrate interactive click prompts with cross-attention blocks, pre-train U-Mamba2 using self-supervised learning, and incorporate dental domain knowledge into the model design to address key challenges of dental anatomy segmentation in CBCT. Extensive experiments, including independent tests, demonstrate that U-Mamba2 is both effective and efficient, securing first place in both tasks of the ToothFairy3 challenge. In Task 1, U-Mamba2 achieved a mean Dice of 0.84 and an HD95 of 38.17 on the held-out test data, with an average inference time of 40.58 s. In Task 2, U-Mamba2 achieved a mean Dice of 0.87 and an HD95 of 2.15 on the held-out test data. The code is publicly available at https://github.com/zhiqin1998/UMamba2.
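For readers unfamiliar with the headline metric, the following is a minimal sketch of how a mean Dice score over anatomical classes can be computed from two label volumes. It is not the challenge's official evaluation code: the function names `dice_per_class` and `mean_dice`, the choice to skip classes absent from both volumes, and the toy arrays in the usage note are all illustrative assumptions.

```python
import numpy as np


def dice_per_class(pred, target, labels):
    """Return {label: Dice} for each label present in pred or target.

    Assumption: labels absent from both volumes are skipped rather than
    scored; an official evaluation may handle this case differently.
    """
    scores = {}
    for label in labels:
        pred_mask = pred == label
        target_mask = target == label
        denom = pred_mask.sum() + target_mask.sum()
        if denom == 0:
            continue  # class absent from both prediction and reference
        overlap = np.logical_and(pred_mask, target_mask).sum()
        scores[label] = 2.0 * overlap / denom
    return scores


def mean_dice(pred, target, labels):
    """Average the per-class Dice scores; NaN if no class was scored."""
    scores = dice_per_class(pred, target, labels)
    if not scores:
        return float("nan")
    return float(np.mean(list(scores.values())))
```

As a usage example, `mean_dice(np.array([[0, 1, 1], [2, 2, 0]]), np.array([[0, 1, 0], [2, 2, 2]]), labels=[1, 2])` averages a Dice of 2/3 for label 1 and 4/5 for label 2. Background (label 0) is excluded here simply by leaving it out of `labels`.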
