Dental3R: Geometry-Aware Pairing for Intraoral 3D Reconstruction from Sparse-View Photographs

Yiyi Miao, Taoyu Wu, Tong Chen, Ji Jiang, Zhe Tang, Zhengyong Jiang, Angelos Stefanidis, Limin Yu, Jionglong Su

TL;DR

This work tackles the challenge of reconstructing accurate 3D dental occlusions from the sparse, unposed intraoral photographs common in remote orthodontics. It introduces Dental3R, a pose-free, graph-guided pipeline that first applies a Geometry-Aware Pairing Strategy (GAPS) to select a compact set of high-value image pairs and then trains a 3D Gaussian Splatting representation with a wavelet-based fidelity constraint. The key innovations are the GAPS-based subgraph construction, integration with DUSt3R for initial geometry, and a two-level wavelet loss that preserves enamel edges while suppressing high-frequency artifacts under sparse views. Experiments on 950 clinical cases and 195 video sequences demonstrate superior novel-view synthesis quality and reduced memory usage compared with state-of-the-art methods, enabling more accessible and reliable tele-orthodontic visualization and planning.

Abstract

Intraoral 3D reconstruction is fundamental to digital orthodontics, yet conventional methods like intraoral scanning are inaccessible for tele-orthodontics, which typically relies on sparse smartphone imagery. While 3D Gaussian Splatting (3DGS) shows promise for novel view synthesis, its application to the standard clinical triad of unposed anterior and bilateral buccal photographs is challenging. The large view baselines, inconsistent illumination, and specular surfaces common in intraoral settings can destabilize simultaneous pose and geometry estimation. Furthermore, sparse-view photometric supervision often induces a frequency bias, leading to over-smoothed reconstructions that lose critical diagnostic details. To address these limitations, we propose Dental3R, a pose-free, graph-guided pipeline for robust, high-fidelity reconstruction from sparse intraoral photographs. Our method first constructs a Geometry-Aware Pairing Strategy (GAPS) to intelligently select a compact subgraph of high-value image pairs. GAPS focuses on correspondence matching, thereby improving the stability of the geometry initialization and reducing memory usage. Building on the recovered poses and point cloud, we train the 3DGS model with a wavelet-regularized objective. By enforcing band-limited fidelity using a discrete wavelet transform, our approach preserves fine enamel boundaries and interproximal edges while suppressing high-frequency artifacts. We validate our approach on a large-scale dataset of 950 clinical cases and an additional video-based test set of 195 cases. Experimental results demonstrate that Dental3R effectively handles sparse, unposed inputs and achieves superior novel view synthesis quality for dental occlusion visualization, outperforming state-of-the-art methods.
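
To make the idea of a wavelet-regularized objective concrete, the following is a minimal sketch of a two-level wavelet-domain L1 loss between a rendered image and a target image. It is not the paper's implementation: the Haar wavelet, the equal weighting of all subbands, the single-channel input, and the names `haar_dwt2` and `wavelet_l1_loss` are all assumptions made for illustration, since the exact formulation is not given in this section.

```python
import numpy as np


def haar_dwt2(img):
    """One level of an orthonormal 2D Haar transform (assumed wavelet).

    Expects a 2D array with even height and width. Returns the low-pass
    approximation and a tuple of three detail subbands.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0
    row_diff = (a + b - c - d) / 2.0  # difference between the two rows
    col_diff = (a - b + c - d) / 2.0  # difference between the two columns
    diag_diff = (a - b - c + d) / 2.0  # diagonal difference
    return approx, (row_diff, col_diff, diag_diff)


def wavelet_l1_loss(rendered, target, levels=2):
    """Sum of mean absolute errors over all subbands of a multi-level DWT.

    Equal subband weights are an assumption of this sketch.
    """
    r = np.asarray(rendered, dtype=float)
    t = np.asarray(target, dtype=float)
    loss = 0.0
    for _ in range(levels):
        r, r_details = haar_dwt2(r)
        t, t_details = haar_dwt2(t)
        for rd, td in zip(r_details, t_details):
            loss += np.mean(np.abs(rd - td))
    # Compare the coarsest approximation band as well.
    loss += np.mean(np.abs(r - t))
    return loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((8, 8))
    blurred = (target + np.roll(target, 1, axis=1)) / 2.0  # crude smoothing
    print("loss vs. itself: ", wavelet_l1_loss(target, target))
    print("loss vs. blurred:", wavelet_l1_loss(blurred, target))
```

In a training loop, a term like this would typically be added to the usual photometric loss with a weighting coefficient, and the detail subbands could be weighted separately from the approximation band to control how strongly edges are emphasized.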

Paper Structure

This paper contains 25 sections, 12 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Overview of Dental3R. Given a set of sparse and unposed input images, we first employ our GAPS strategy to generate image pairs. Subsequently, we leverage a stereo-dense reconstruction model to regress a dense point cloud in a global coordinate system, while concurrently obtaining the corresponding relative camera poses. The resulting point cloud is then used to initialize the 3D Gaussians. During optimization, we incorporate wavelet constraints to maintain geometric consistency and preserve frequency details.
  • Figure 2: Novel View Synthesis Comparisons with 6-view and 9-view inputs. We qualitatively compare the quality of novel view synthesis with 3DGS [kerbl20233dgs], CF-3DGS [cf3dgs], and InstantSplat [instantsplat], and show that our method achieves better quality and more accurate texture details.
  • Figure 3: Novel View Synthesis Results with Different Pairing Strategies. We perform a qualitative comparison against the complete and oneref graph strategies from InstantSplat [instantsplat], as well as the cosine graph strategy from EasySplat [gao2025easysplat]. The results demonstrate that our proposed GAPS strategy achieves novel view synthesis performance competitive with the exhaustive complete strategy. Furthermore, GAPS surpasses the other two sparse methods (oneref and cosine), yielding superior rendering quality and more accurate textural details across various input views.
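
To illustrate why the choice of pair graph affects memory usage, the sketch below enumerates unordered image pairs for the two baseline strategies named in Figure 3, for the 6-view and 9-view settings of Figure 2. It assumes that "complete" means every unordered pair of views and that "oneref" means one reference view paired with every other view; the function names are hypothetical, and GAPS itself is not implemented here because its selection rule is not described in this section.

```python
from itertools import combinations


def complete_pairs(n_views):
    """All unordered pairs of view indices (assumed 'complete' graph)."""
    return list(combinations(range(n_views), 2))


def oneref_pairs(n_views, ref=0):
    """One reference view paired with every other view (assumed 'oneref' graph)."""
    return [(ref, i) for i in range(n_views) if i != ref]


if __name__ == "__main__":
    for n in (6, 9):
        print(f"{n} views: complete={len(complete_pairs(n))} pairs, "
              f"oneref={len(oneref_pairs(n))} pairs")
    # 6 views: complete=15 pairs, oneref=5 pairs
    # 9 views: complete=36 pairs, oneref=8 pairs
```

The complete graph grows quadratically with the number of views while the single-reference graph grows linearly, which is the trade-off a compact, selectively chosen subgraph aims to balance.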