DentalSplat: Dental Occlusion Novel View Synthesis from Sparse Intra-Oral Photographs

Yiyi Miao, Taoyu Wu, Tong Chen, Sihao Li, Ji Jiang, Youpeng Yang, Angelos Stefanidis, Limin Yu, Jionglong Su

TL;DR

DentalSplat provides a novel pipeline for 3D dental occlusion reconstruction from sparse, unposed intra-oral images by combining a prior-guided dense stereo initialization with Scale-Adaptive Pruning and a 3D Gaussian Splatting optimization that incorporates optical-flow constraints and gradient regularization. The method leverages DUSt3R for robust initialization, then refines a Gaussian-based scene representation to achieve high-quality novel view synthesis with only a few views, outperforming state-of-the-art pose-free and sparse-view baselines. Extensive experiments on 950 clinical cases and a 195-video remote-imaging test set demonstrate faster convergence, superior rendering fidelity, and strong robustness to real-world dental imaging artifacts. The approach enables practical remote orthodontic monitoring by producing reliable occlusion visualizations from minimal input within minutes.

Abstract

In orthodontic treatment, particularly within telemedicine contexts, observing patients' dental occlusion from multiple viewpoints facilitates timely clinical decision-making. Recent advances in 3D Gaussian Splatting (3DGS) have shown strong potential in 3D reconstruction and novel view synthesis. However, conventional 3DGS pipelines typically rely on densely captured multi-view inputs and precisely initialized camera poses, limiting their practicality. Orthodontic cases, in contrast, often comprise only three sparse images, specifically, the anterior view and bilateral buccal views, rendering the reconstruction task especially challenging. The extreme sparsity of input views severely degrades reconstruction quality, while the absence of camera pose information further complicates the process. To overcome these limitations, we propose DentalSplat, an effective framework for 3D reconstruction from sparse orthodontic imagery. Our method leverages a prior-guided dense stereo reconstruction model to initialize the point cloud, followed by a scale-adaptive pruning strategy to improve the training efficiency and reconstruction quality of 3DGS. In scenarios with extremely sparse viewpoints, we further incorporate optical flow as a geometric constraint, coupled with gradient regularization, to enhance rendering fidelity. We validate our approach on a large-scale dataset comprising 950 clinical cases and an additional video-based test set of 195 cases designed to simulate real-world remote orthodontic imaging conditions. Experimental results demonstrate that our method effectively handles sparse input scenarios and achieves superior novel view synthesis quality for dental occlusion visualization, outperforming state-of-the-art techniques.

Paper Structure

This paper contains 16 sections, 16 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Overview of DentalSplat. Given a set of sparse and unposed input images, we leverage a dense stereo reconstruction model to regress the dense point cloud of these input images in the global coordinate system and obtain the corresponding relative camera poses. Subsequently, we apply the Scale-Adaptive Pruning (SAP) strategy to eliminate outlier points, followed by downsampling to obtain a sparse point cloud suitable for 3DGS initialization. During optimization, we incorporate optical flow constraints to ensure geometric consistency and employ gradient constraints to enhance the densification of the 3DGS.
  • Figure 2: Overview of Flow Constraint. At time $t$, each 2D pixel $x_t$ is formed by projecting $K$ overlapping 3D Gaussians under camera pose $\mathcal{T}_t$. At time $t+1$, their motions induce Gaussian flows whose projections are aggregated to estimate the overall optical flow. To jointly optimize the 3D Gaussian primitives $\hat{\mathcal{G}}$ and the camera pose $\mathcal{T}_{t+1}$, we minimize the residual between the estimated optical flow and the ground truth optical flow computed using an off-the-shelf method.
  • Figure 3: Novel View Synthesis Comparisons with 6-view and 9-view inputs. We qualitatively compare the quality of novel view synthesis and show that our method achieves better quality with more accurate texture details.
  • Figure 4: Reconstruction Comparisons with 3 views. Rendered images and ground truth (GT) for the 3-view input setting.
  • Figure 5: Novel View Synthesis Comparisons with 3 views. Due to the lack of ground truth for the 3-view input setting, our analysis focuses on relative performance improvements.
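
The flow constraint described in Figure 2 can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names `blend_weights`, `aggregated_flow`, and `flow_residual` are hypothetical, the per-pixel weights are taken to be normalized front-to-back alpha-compositing weights, and each Gaussian's projected 2D covariance is assumed unchanged between $t$ and $t+1$, so that its flow contribution reduces to the displacement of its projected mean. Under those assumptions, the estimated flow at a pixel is a weighted sum over the $K$ overlapping Gaussians, and the constraint penalizes its distance to the reference flow from an off-the-shelf estimator.

```python
from math import hypot


def blend_weights(alphas):
    """Front-to-back alpha-compositing weights, normalized to sum to 1.

    alphas: opacities of the K Gaussians covering a pixel, nearest first.
    (Assumed weighting scheme; the paper's exact weights may differ.)
    """
    weights, transmittance = [], 1.0
    for a in alphas:
        weights.append(a * transmittance)
        transmittance *= 1.0 - a
    total = sum(weights)
    if total <= 0.0:
        return [0.0] * len(weights)
    return [w / total for w in weights]


def aggregated_flow(alphas, means_t, means_t1):
    """Estimate the 2D flow at one pixel from K overlapping Gaussians.

    means_t / means_t1: projected 2D means (x, y) at time t and t+1.
    Simplification: covariances are assumed constant across the two
    frames, so each Gaussian contributes only its mean displacement.
    """
    w = blend_weights(alphas)
    fx = sum(wi * (m1[0] - m0[0]) for wi, m0, m1 in zip(w, means_t, means_t1))
    fy = sum(wi * (m1[1] - m0[1]) for wi, m0, m1 in zip(w, means_t, means_t1))
    return (fx, fy)


def flow_residual(pred, target):
    """Euclidean distance between estimated and reference flow vectors."""
    return hypot(pred[0] - target[0], pred[1] - target[1])


# Two Gaussians with opacity 0.5 each: raw weights 0.5 and 0.25,
# normalized to 2/3 and 1/3.
flow = aggregated_flow(
    alphas=[0.5, 0.5],
    means_t=[(10.0, 10.0), (12.0, 10.0)],
    means_t1=[(13.0, 10.0), (12.0, 13.0)],
)
residual = flow_residual(flow, (2.0, 1.0))
```

In a full pipeline this residual would be summed over pixels and differentiated with respect to both the Gaussian parameters and the camera pose $\mathcal{T}_{t+1}$; the sketch only shows the per-pixel aggregation and residual.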