Processing and Segmentation of Human Teeth from 2D Images using Weakly Supervised Learning

Tomáš Kunzo, Viktor Kocur, Lukáš Gajdošech, Martin Madaras

TL;DR

The paper tackles teeth segmentation under limited annotation by proposing a weakly supervised approach that leverages keypoint heatmaps and intermediate feature maps from a teeth keypoint detector. A CenterNet-based keypoint detector trained on the TriDental dataset provides the guidance for segmentation through multi-scale feature fusion and postprocessing steps including CRF and watershed, enabling masks without explicit segmentation labels. Experiments on TriDental show improvements over baselines and demonstrate the method's robustness across views, with Segment Anything enhanced by the learned keypoints further validating the approach. The work offers a cost-effective, adaptable solution for dental imaging and sets the stage for broader adoption and real-time applications in clinical settings.

Abstract

Teeth segmentation is an essential task in dental image analysis for accurate diagnosis and treatment planning. While supervised deep learning methods can be utilized for teeth segmentation, they often require extensive manual annotation of segmentation masks, which is time-consuming and costly. In this research, we propose a weakly supervised approach for teeth segmentation that reduces the need for manual annotation. Our method utilizes the output heatmaps and intermediate feature maps from a keypoint detection network to guide the segmentation process. We introduce the TriDental dataset, consisting of 3000 oral cavity images annotated with teeth keypoints, to train a teeth keypoint detection network. We combine feature maps from different layers of the keypoint detection network, enabling accurate teeth segmentation without explicit segmentation annotations. The detected keypoints are also used for further refinement of the segmentation masks. Experimental results on the TriDental dataset demonstrate the superiority of our approach in terms of accuracy and robustness compared to state-of-the-art segmentation methods. Our method offers a cost-effective and efficient solution for teeth segmentation in real-world dental applications, eliminating the need for extensive manual annotation efforts.
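
The feature-map combination described in the abstract can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function names, the layer shapes, the nearest-neighbour upsampling, and the final min-max normalisation are all assumptions made for illustration. Because maps from different layers have different resolutions, the sketch upsamples each channel-averaged map before adding them.

```python
import numpy as np

def upsample_nearest(fmap, out_h, out_w):
    """Nearest-neighbour upsampling of a 2D map to (out_h, out_w)."""
    h, w = fmap.shape
    rows = np.arange(out_h) * h // out_h   # source row for every output row
    cols = np.arange(out_w) * w // out_w   # source column for every output column
    return fmap[rows[:, None], cols[None, :]]

def combine_feature_maps(feature_maps, out_h, out_w):
    """Fuse feature maps of shape (C_i, H_i, W_i) into one (out_h, out_w) map."""
    combined = np.zeros((out_h, out_w), dtype=np.float64)
    for fmap in feature_maps:
        channel_mean = fmap.mean(axis=0)                 # average over channels
        combined += upsample_nearest(channel_mean, out_h, out_w)
    # Assumption: rescale to [0, 1] so that later thresholding is scale-free.
    lo, hi = combined.min(), combined.max()
    return (combined - lo) / (hi - lo) if hi > lo else combined

# Hypothetical activations from three backbone layers for a 512 x 512 input.
rng = np.random.default_rng(0)
layers = [rng.random((64, 128, 128)),
          rng.random((128, 64, 64)),
          rng.random((256, 32, 32))]
fused = combine_feature_maps(layers, 512, 512)
```

In practice a bilinear upsampling would probably give smoother maps; nearest-neighbour is used here only to keep the sketch dependency-free.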

Paper Structure (17 sections, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Example of three different views of the same oral cavity representing a single sequence in the TriDental dataset.
  • Figure 2: Example of keypoint predictions from our network with a $512 \times 512$ output heatmap, trained on all views. Dark purple markers represent ground-truth keypoints, turquoise markers represent predictions. The image on the right shows two false positive detections.
  • Figure 3: Precision, recall, and F1 score (y-axis) as the distance threshold (x-axis) is varied. We show the results on the test set of TriDental for the models trained on all images.
  • Figure 4: We combine several feature maps from the ResNet18 backbone of the trained keypoint detection network into a single combined feature map, which we further process to obtain segmentation masks. To obtain the combined feature map, feature maps from three different layers of the model are averaged, added together, and upsampled to the resolution of the input image.
  • Figure 5: Post-processing pipeline of our method. The combined feature map is post-processed using Otsu's thresholding, morphological operations, a CRF, and a modified version of the watershed algorithm that uses the output heatmaps of our keypoint detection method to obtain the final segmentation masks.
  • ...and 1 more figure
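
The evaluation plotted in Figure 3 counts a predicted keypoint as correct when it lies within a distance threshold of a ground-truth keypoint. The sketch below shows one way such scores could be computed. The greedy closest-pair matching rule, the function name, and the example coordinates are assumptions, since the captions do not specify them.

```python
import math

def keypoint_prf(predictions, ground_truth, threshold):
    """Precision, recall and F1 for 2D keypoints at a pixel distance threshold.

    Assumption: each ground-truth point may be matched by at most one
    prediction, and pairs are matched greedily from closest to farthest.
    """
    pairs = sorted(
        (math.dist(p, g), i, j)
        for i, p in enumerate(predictions)
        for j, g in enumerate(ground_truth)
    )
    used_pred, used_gt = set(), set()
    for dist, i, j in pairs:
        if dist > threshold:
            break                      # all remaining pairs are too far apart
        if i not in used_pred and j not in used_gt:
            used_pred.add(i)
            used_gt.add(j)
    tp = len(used_pred)                # true positives = matched predictions
    precision = tp / len(predictions) if predictions else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

# Hypothetical example: three annotated teeth, four detections (two spurious).
gt = [(10, 10), (50, 50), (90, 20)]
preds = [(11, 10), (48, 51), (200, 200), (300, 5)]
precision, recall, f1 = keypoint_prf(preds, gt, threshold=5.0)
```

Sweeping `threshold` over a range of pixel distances and plotting the three returned values would give curves of the kind the Figure 3 caption describes.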
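
Figure 5 lists Otsu's thresholding among the post-processing steps applied to the combined feature map. As a rough illustration of that single step, the following NumPy sketch picks the threshold that maximises the between-class variance of the map's histogram. The morphological operations, the CRF, and the modified watershed are deliberately left out, and the bin count, function name, and toy input are assumptions.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the histogram threshold that maximises between-class variance."""
    hist, edges = np.histogram(values.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    weight_bg = np.cumsum(hist)             # pixels in bins up to and including k
    weight_fg = hist.sum() - weight_bg      # pixels in bins above k
    cum_sum = np.cumsum(hist * centers)
    valid = (weight_bg > 0) & (weight_fg > 0)
    if not valid.any():                     # constant input: nothing to separate
        return centers[0]
    mean_bg = cum_sum[valid] / weight_bg[valid]
    mean_fg = (cum_sum[-1] - cum_sum[valid]) / weight_fg[valid]
    between_var = weight_bg[valid] * weight_fg[valid] * (mean_bg - mean_fg) ** 2
    return centers[valid][np.argmax(between_var)]

# Toy combined feature map: a bright 4 x 4 "tooth" on a dark 8 x 8 background.
feature_map = np.full((8, 8), 0.1)
feature_map[2:6, 2:6] = 0.9
threshold = otsu_threshold(feature_map)
mask = feature_map > threshold              # coarse foreground mask
```

The resulting binary mask would only be a starting point; per Figure 5, the later steps refine it and use the keypoint heatmaps to obtain the final masks.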