3D Dental Model Segmentation with Geometrical Boundary Preserving

Shufan Xi, Zexian Liu, Junlin Chang, Hongyu Wu, Xiaogang Wang, Aimin Hao

TL;DR

CrossTooth tackles the boundary preservation problem in 3D intraoral scan segmentation by combining curvature-aware selective downsampling with cross-modal boundary features derived from multi-view rendered images. A dual-stream architecture fuses a point-based transformer backbone with an image-based segmentation module, projecting dense image features back onto the point cloud to sharpen tooth-gingiva boundaries. The method achieves state-of-the-art performance on the 3DTeethSeg dataset, notably improving overall mIoU to $95.86\%$ and boundary IoU to $82.05\%$, while increasing boundary vertex density by $10\%$–$15\%$ over QEM. These results demonstrate the practical impact of integrating surface curvature priors and image-derived boundary cues for clinically robust intraoral tooth segmentation.
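For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute mean IoU over per-vertex labels. It is not taken from the paper: the function name `mean_iou` and the optional `mask` argument are hypothetical, and treating "boundary IoU" as the same metric restricted to boundary vertices is an assumption, since the summary does not define it.

```python
from collections import defaultdict


def mean_iou(pred, target, mask=None):
    """Mean intersection-over-union across all classes seen in pred or target.

    pred, target: equal-length sequences of integer class labels (one per vertex).
    mask: optional sequence of booleans; when given, only vertices with a True
          entry are scored (assumed here to mark boundary vertices).
    """
    inter = defaultdict(int)  # per-class count of vertices where pred == target
    union = defaultdict(int)  # per-class count of vertices in pred or target
    for idx, (p, t) in enumerate(zip(pred, target)):
        if mask is not None and not mask[idx]:
            continue
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatch adds one vertex to the union of both classes involved.
            union[p] += 1
            union[t] += 1
    if not union:
        return 0.0
    return sum(inter[c] / union[c] for c in union) / len(union)


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    print(mean_iou(pred, target))
    print(mean_iou(pred, target, mask=[False, True, True, False]))
```

Restricting the score to a mask makes boundary errors visible: a few mislabeled vertices barely move the overall score on a full scan but dominate it inside a narrow boundary band.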

Abstract

3D intraoral scan meshes are widely used in digital dentistry diagnosis, and segmenting them is a critical preliminary task. Numerous approaches have been devised for precise tooth segmentation. Current deep learning-based methods can segment the crown with high accuracy. However, segmentation accuracy at the junction between the crown and the gum still falls short, and existing downsampling methods cannot effectively preserve the geometric details at this junction. To address these problems, we propose CrossTooth, a boundary-preserving segmentation method that combines selective downsampling of the 3D mesh, which retains more vertices in the tooth-gingiva area, with cross-modal discriminative boundary features extracted from multi-view rendered images, which enhance the geometric representation of the segmentation network. Using a point network as the backbone and incorporating complementary image features, CrossTooth significantly improves segmentation accuracy, as demonstrated by experiments on a public intraoral scan dataset.
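The idea of keeping more vertices where the surface bends sharply can be illustrated with a small sketch. This is not the authors' selective downsampling algorithm, which operates on the mesh and is not specified in this summary. It swaps in a simpler technique: weighted sampling of vertex indices without replacement (Efraimidis-Spirakis keys), with weights that grow with absolute curvature. The function name `selective_sample` and the parameter `alpha` are hypothetical.

```python
import random


def selective_sample(curvatures, k, alpha=4.0, seed=0):
    """Pick k vertex indices, favoring vertices with high absolute curvature.

    curvatures: per-vertex curvature values (assumed precomputed).
    alpha: hypothetical knob; 0 gives uniform sampling, larger values
           bias the sample more strongly toward high-curvature vertices.
    """
    rng = random.Random(seed)
    max_c = max(abs(c) for c in curvatures) or 1.0  # avoid division by zero
    keyed = []
    for i, c in enumerate(curvatures):
        weight = 1.0 + alpha * abs(c) / max_c
        # Efraimidis-Spirakis key: larger weights tend to produce larger keys.
        keyed.append((rng.random() ** (1.0 / weight), i))
    keyed.sort(reverse=True)
    return sorted(i for _, i in keyed[:k])


if __name__ == "__main__":
    # 900 flat vertices followed by 100 high-curvature "boundary" vertices.
    curv = [0.0] * 900 + [1.0] * 100
    kept = selective_sample(curv, k=200)
    boundary_kept = sum(1 for i in kept if i >= 900)
    print(len(kept), boundary_kept)
```

Uniform sampling would keep about 20 of the 100 high-curvature vertices in this toy setup; the weighted variant keeps noticeably more, which is the effect the abstract attributes to selective downsampling at the tooth-gingiva area.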

Paper Structure

This paper contains 17 sections, 5 equations, 5 figures, 5 tables, 1 algorithm.

Figures (5)

  • Figure 1: An illustration of a 3D intraoral scan model. The original intraoral scan consists of points and triangles and can be visualized with a mean curvature histogram. The redder the color, the lower the curvature. Deep learning methods usually take sampled points as input, which blurs boundaries. In contrast, tooth edges are clearer in images rendered from the intraoral scan than in the sampled points, as indicated by the red and blue boxes in the zoomed views, respectively.
  • Figure 2: Architecture of CrossTooth. The point network takes points from the selectively downsampled intraoral scan model as input and adopts a multi-scale encoder-decoder structure. The downsample block uses kNN to aggregate features from neighboring points, the transformer block applies self-attention to learn long-sequence contextual information, and the upsample block fuses features from the encoder and decoder, as illustrated in (a) to (c), respectively. The image network takes rendered images and concatenates local and global features for downstream tasks. Then, following the correspondences between image pixels and points, image features are projected back onto the points for further fusion, as illustrated in (d). Standard MLP and CNN layers produce the final segmentation masks.
  • Figure 3: Comparison of QEM (quadric error metrics) and the selective downsampling method. Our method preserves tooth boundaries better than QEM, as visualized in the density histogram (d). Quantitatively, selective downsampling yields 10% to 15% higher vertex density than QEM in boundary areas.
  • Figure 4: Visualization of segmentation results alongside the corresponding ground-truth annotations. Important areas are marked with red dotted circles. CrossTooth performs better than the other methods in all listed intraoral scan cases.
  • Figure 5: CrossTooth performs better than the two variants that use only image features or only point features. This demonstrates that image and point features are complementary: each can correct the other's segmentation errors.
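
The back-projection step described in the Figure 2 caption (part d) can be sketched as follows, under stated assumptions. The caption only says that image features are projected onto points by following image-point correspondences; averaging across the views in which a point is visible, and fusing by concatenation, are assumptions made here for illustration. The names `project_image_features` and `fuse` are hypothetical, and plain Python lists stand in for tensors.

```python
def project_image_features(point_pixels, feature_maps, feat_dim):
    """Gather a per-point image feature from multi-view feature maps.

    point_pixels[v][i]: (row, col) of point i in view v, or None if the
                        point is not visible in that view.
    feature_maps[v][row][col]: list of feat_dim floats for that pixel.
    Returns one feat_dim vector per point (zeros if seen in no view).
    """
    num_points = len(point_pixels[0])
    projected = []
    for i in range(num_points):
        acc = [0.0] * feat_dim
        seen = 0
        for v, pixels in enumerate(point_pixels):
            if pixels[i] is None:
                continue
            row, col = pixels[i]
            feat = feature_maps[v][row][col]
            acc = [a + f for a, f in zip(acc, feat)]
            seen += 1
        # Average over visible views (an assumed aggregation rule).
        projected.append([a / seen for a in acc] if seen else acc)
    return projected


def fuse(point_feats, image_feats):
    """Concatenate point-network and projected image features per point."""
    return [p + q for p, q in zip(point_feats, image_feats)]


if __name__ == "__main__":
    # Two views, each a 1x2 feature map with 2 channels.
    maps = [
        [[[1.0, 0.0], [0.0, 1.0]]],
        [[[3.0, 0.0], [0.0, 3.0]]],
    ]
    # Point 0 is seen in both views, point 1 in one, point 2 in none.
    pixels = [
        [(0, 0), (0, 1), None],
        [(0, 0), None, None],
    ]
    img = project_image_features(pixels, maps, feat_dim=2)
    print(fuse([[9.0], [8.0], [7.0]], img))
```

In a real pipeline the pixel coordinates would come from the renderer that produced the multi-view images, and the fused vectors would feed the final MLP that predicts per-point labels.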