MARformer: An Efficient Metal Artifact Reduction Transformer for Dental CBCT Images

Yuxuan Shi, Jun Xu, Dinggang Shen

TL;DR

This work tackles metal artifacts in dental CBCT by introducing MARformer, a lightweight Transformer for metal artifact reduction (MAR). It combines a Dimension-Reduced Self-Attention (DRSA) mechanism that operates along the channel dimension with a Patch-wise Perceptive FFN (P2FFN) for local detail restoration. The model adopts a U-Net–like architecture with three encoder–decoder levels and a bottleneck, using DRSA to capture global correlations at reduced computational cost and P2FFN to recover fine structures; the final output adds a learned residual to the degraded input. On a large dataset with synthetic and real metal artifacts (MA), MARformer-L achieves state-of-the-art PSNR/SSIM with far fewer parameters and FLOPs than competing methods, while MARformer-T offers a highly efficient alternative with comparable restoration quality. Ablation studies validate the effectiveness of channel-wise attention and the chosen kernel sizes, supporting practical deployment for improved dental diagnosis and downstream tooth segmentation.

Abstract

Cone Beam Computed Tomography (CBCT) plays a key role in dental diagnosis and surgery. However, metal dental implants can introduce metal artifacts during the CBCT imaging process, interfering with diagnosis and with downstream processing such as tooth segmentation. In this paper, we develop an efficient Transformer to perform metal artifact reduction (MAR) on dental CBCT images. The proposed MAR Transformer (MARformer) reduces the computational complexity of multi-head self-attention with a new Dimension-Reduced Self-Attention (DRSA) module, based on the observation that CBCT images have globally similar structure. A Patch-wise Perceptive Feed Forward Network (P2FFN) is also proposed to perceive local image information for fine-grained restoration. Experimental results on CBCT images with synthetic and real-world metal artifacts show that our MARformer is efficient and outperforms previous MAR methods and two restoration Transformers.
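
To make the cost argument concrete, the following is a minimal sketch of generic channel-wise self-attention, where the attention map is C x C (channels) rather than N x N (pixels). It is not the paper's DRSA module, whose exact design is not given here: the function names, the plain-list data layout, and the 1/sqrt(N) scaling are all assumptions chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(q, k, v):
    """Generic channel-wise attention (illustrative, not the paper's DRSA).

    q, k, v: lists of C channels, each a list of N flattened pixel values.
    Returns C output channels of N values each.
    """
    n = len(q[0])
    scale = 1.0 / math.sqrt(n)  # assumed scaling; a learned temperature is also common
    # C x C map: similarity of every query channel to every key channel,
    # computed as a dot product over the N spatial positions.
    attn = [
        softmax([scale * sum(qi[p] * kj[p] for p in range(n)) for kj in k])
        for qi in q
    ]
    # Each output channel is a weighted mix of the value channels.
    return [
        [sum(w * vj[p] for w, vj in zip(row, v)) for p in range(n)]
        for row in attn
    ]


def attention_map_sizes(height, width, channels):
    """Number of entries in the attention map for each attention style."""
    n = height * width
    return {"spatial": n * n, "channel": channels * channels}
```

For a hypothetical 64 x 64 feature map with 48 channels, `attention_map_sizes(64, 64, 48)` gives about 16.8 million entries for pixel-wise attention against 2304 for channel-wise attention, which is the kind of saving that attending along the channel dimension is meant to deliver.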

Paper Structure

This paper contains 14 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Segmentation results by Poolformer (Yu et al., 2022). (a) A CBCT image without metal artifacts. (b) Segmentation mask of Poolformer on (a). (c) The image (a) with synthetic metal artifacts. (d) Segmentation mask of Poolformer on (c).
  • Figure 2: Illustration of the proposed MARformer (a), the Transformer Block in our MARformer (b), the proposed Dimension-Reduced Self-Attention (DRSA) module (c), and the proposed Patch-wise Perceptive Feed Forward Network (P2FFN) (d).
  • Figure 3: Visual comparison of MAR results from different methods on an image with synthetic metal artifacts (MA). The PSNR (dB)/SSIM values are reported below each image for reference.
  • Figure 4: Comparison of MAR results from different methods on an image with real-world MA. The last image is the metal mask, obtained by selecting the pixels above 2800 HU in the MA image.
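
The captions for Figures 3 and 4 mention two simple computations: PSNR in dB and a metal mask taken from pixels above 2800 HU. A minimal sketch of both follows, assuming images are stored as nested lists of numbers. The function names and the `data_range` parameter are illustrative assumptions, and the paper's own evaluation code may normalize intensities differently.

```python
import math


def metal_mask(image_hu, threshold_hu=2800):
    """Binary mask marking pixels strictly above the HU threshold (1 = metal)."""
    return [[1 if value > threshold_hu else 0 for value in row] for row in image_hu]


def psnr(reference, restored, data_range):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images.

    data_range is the assumed maximum possible intensity span
    (for example 1.0 for images normalized to [0, 1]).
    """
    squared_errors = [
        (r - s) ** 2
        for ref_row, res_row in zip(reference, restored)
        for r, s in zip(ref_row, res_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)
```

With `data_range=1.0`, a uniform error of 0.1 on every pixel gives a mean squared error of 0.01 and therefore a PSNR of 20 dB.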