
T-Mamba: A unified framework with Long-Range Dependency in dual-domain for 2D & 3D Tooth Segmentation

Jing Hao, Yonghui Zhu, Lei He, Moyun Liu, James Kit Hon Tsoi, Kuo Feng Hung

TL;DR

T-Mamba proposes a unified framework that introduces frequency-domain features into vision Mamba, coupled with a Gate Selection Unit and shared bi-positional encoding, to address long-range dependency and noise-robustness challenges in 2D and 3D tooth segmentation. The Tim block enhances global context modeling while preserving spatial information, enabling a single architecture to handle both 2D X-ray and 3D CBCT data. Extensive experiments on 3D CBCT and the new TED3 dataset demonstrate state-of-the-art performance with improved efficiency, and ablations confirm the effectiveness of each Tim component. The work also provides TED3, a large public 2D tooth X-ray dataset, to support broader research in dental image analysis.

Abstract

Tooth segmentation is a pivotal step in modern digital dentistry, essential for applications across orthodontic diagnosis and treatment planning. Despite its importance, this process is fraught with challenges due to the high noise and low contrast inherent in 2D and 3D tooth data. Both Convolutional Neural Networks (CNNs) and Transformers have shown promise in medical image segmentation, yet each has limitations in handling long-range dependencies or computational complexity. To address this issue, this paper introduces T-Mamba, which integrates frequency-based features and shared bi-positional encoding into Vision Mamba for efficient global feature modeling. In addition, we design a gate selection unit that adaptively integrates two spatial-domain features and one frequency-domain feature. T-Mamba is the first work to introduce frequency-based features into Vision Mamba, and its flexibility allows it to process both 2D and 3D tooth data without the need for separate modules. This paper also presents TED3, a large-scale public 2D dental X-ray dataset. Extensive experiments demonstrate that T-Mamba achieves new SOTA results on a public tooth CBCT dataset and outperforms previous SOTA methods on the TED3 dataset. The code and models are publicly available at: https://github.com/isbrycee/T-Mamba.
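
To make the idea of adaptively integrating two spatial-domain features with one frequency-domain feature more concrete, the following is a minimal NumPy sketch of one way such a gated fusion could look. It is not the authors' implementation: the function names `frequency_branch` and `gate_fuse`, the low-pass FFT filter, the mean pooling, and the softmax gate are all assumptions chosen for illustration. The actual design is in the repository linked above.

```python
import numpy as np


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def frequency_branch(x):
    """Hypothetical frequency-domain feature for a (L, D) token sequence.

    Assumption: keep only the lower half of the real-FFT spectrum along the
    token axis and transform back (a simple low-pass filter).
    """
    spec = np.fft.rfft(x, axis=0)            # shape (L // 2 + 1, D), complex
    keep = max(1, spec.shape[0] // 2)        # number of low frequencies kept
    spec[keep:] = 0                          # zero out the high frequencies
    return np.fft.irfft(spec, n=x.shape[0], axis=0)


def gate_fuse(spatial_a, spatial_b, freq_feat, w_gate):
    """Fuse three (L, D) features with per-token softmax gate weights.

    w_gate is a hypothetical (3, 3) matrix that would be learned in practice.
    """
    stacked = np.stack([spatial_a, spatial_b, freq_feat], axis=0)  # (3, L, D)
    pooled = stacked.mean(axis=2)                                  # (3, L)
    logits = w_gate @ pooled                                       # (3, L)
    weights = softmax(logits, axis=0)        # sums to 1 over the 3 branches
    fused = (weights[:, :, None] * stacked).sum(axis=0)            # (L, D)
    return fused, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.standard_normal((16, 8))    # 16 tokens, 8 channels
    spatial_a = tokens                       # stand-in for one spatial feature
    spatial_b = tokens[::-1]                 # stand-in for a second spatial feature
    freq = frequency_branch(tokens)
    fused, weights = gate_fuse(spatial_a, spatial_b, freq, rng.standard_normal((3, 3)))
    print(fused.shape, weights.shape)
```

Because the gate weights are normalized across the three branches for every token, the fused output is a convex combination of the inputs, so a noisy branch can be down-weighted per token rather than globally.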

Paper Structure (22 sections, 8 equations, 10 figures, 7 tables)

This paper contains 22 sections, 8 equations, 10 figures, 7 tables.

Figures (10)

  • Figure 1: The framework of T-Mamba.
  • Figure 2: The detailed statistics of TED3.
  • Figure 3: The examples of TED3-labelled dataset in the test set. The Tooth Mask Ratio indicates the ratio of tooth area to image area. We only visualize the contours of masks for a better view.
  • Figure 4: The 3D CBCT tooth dataset samples.
  • Figure 5: Visual Evaluation of T-Mamba Against State-of-the-Art Methods on 3D CBCT tooth
  • ...and 5 more figures