Variational Autoencoding of Dental Point Clouds
Johan Ziruo Ye, Thomas Ørkild, Peter Lempel Søndergaard, Søren Hauberg
TL;DR
This paper addresses the challenge of probabilistic 3D modeling for dental point clouds by introducing VF-Net, a fully probabilistic variational autoencoder that establishes a one-to-one correspondence between input and output points through per-point encodings projected onto a learnable 2D plane $\mathcal{G}=[-1,1]^2$. By replacing Chamfer-distance-based reconstruction with a likelihood-based objective and placing a normalizing-flow prior over the latent variables, VF-Net enables tractable probabilistic evaluation and efficient mesh generation, shape completion, and representation learning. The paper also introduces a new dataset, FDI 16, of 7,732 dental tooth meshes/point clouds, and demonstrates state-of-the-art generative performance and lower reconstruction error on dental data, along with robust latent representations and interpolation capabilities. The work has practical relevance for digital dentistry, including edge-deployable mesh generation and reliable shape completion, and it acknowledges ethical considerations around potential misuse of realistic synthetic dental data.
Abstract
Digital dentistry has made significant advancements, yet numerous challenges remain. This paper introduces the FDI 16 dataset, an extensive collection of tooth meshes and point clouds. Additionally, we present a novel approach: Variational FoldingNet (VF-Net), a fully probabilistic variational autoencoder for point clouds. Notably, prior latent variable models for point clouds lack a one-to-one correspondence between input and output points. Instead, they rely on optimizing Chamfer distances, a metric that lacks a normalized distributional counterpart, rendering it unsuitable for probabilistic modeling. We replace the explicit minimization of Chamfer distances with a suitable encoder, increasing computational efficiency while simplifying the probabilistic extension. This allows for straightforward application in various tasks, including mesh generation, shape completion, and representation learning. Empirically, we provide evidence of lower reconstruction error in dental reconstruction and interpolation, showcasing state-of-the-art performance in dental sample generation while identifying valuable latent representations.
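To make the contrast between Chamfer distance and a correspondence-based likelihood concrete, the following is a minimal sketch and not the VF-Net implementation. The function names `chamfer_distance` and `gaussian_log_likelihood` are hypothetical, and the fixed-variance isotropic Gaussian is an assumption chosen for illustration; the abstract does not specify the likelihood used. The sketch shows that Chamfer distance compares two sets through nearest neighbours and ignores point order, whereas a per-point likelihood needs to know which output point explains which input point.

```python
import math


def chamfer_distance(xs, ys):
    """Symmetric Chamfer distance: mean squared nearest-neighbour
    distance from xs to ys plus the same from ys to xs."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    def one_way(src, dst):
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(xs, ys) + one_way(ys, xs)


def gaussian_log_likelihood(xs, mus, sigma):
    """Log-likelihood of xs under independent isotropic Gaussians
    N(mu_i, sigma^2 I), where x_i is paired with mu_i by index.
    Assumption: fixed sigma, used here only for illustration."""
    dim = len(xs[0])
    log_norm = -0.5 * dim * math.log(2.0 * math.pi * sigma ** 2)
    total = 0.0
    for x, mu in zip(xs, mus):
        sq = sum((a - b) ** 2 for a, b in zip(x, mu))
        total += -0.5 * sq / sigma ** 2 + log_norm
    return total


# Two input points and a "reconstruction" that is the same set, reordered.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
permuted = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

# Chamfer distance is blind to the ordering of the points.
print(chamfer_distance(points, permuted))

# The index-paired likelihood is not: a wrong pairing is penalised.
print(gaussian_log_likelihood(points, points, sigma=1.0))
print(gaussian_log_likelihood(points, permuted, sigma=1.0))
```

Because the likelihood depends on the pairing, a model that outputs points in a known correspondence with its inputs can be trained and evaluated with a normalized density, which is the property the abstract attributes to the one-to-one correspondence.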
