Neural Parametric Gaussians for Monocular Non-Rigid Object Reconstruction
Devikalyan Das, Christopher Wewer, Raza Yunus, Eddy Ilg, Jan Eric Lenssen
TL;DR
This work tackles monocular non-rigid object reconstruction with Neural Parametric Gaussians (NPGs), a two-stage framework that first learns a temporally coherent coarse deformation model and then optimizes 3D Gaussians within local volumes driven by that coarse model. The coarse stage provides strong regularization and correspondences across time, while the Gaussian stage captures fine-scale geometry and appearance, so the combined model can be rendered as a radiance field for dynamic objects. On synthetic and real monocular datasets with few multi-view cues, NPGs achieve state-of-the-art or competitive results, and rendering is fast because the per-scene model is rendered with Gaussian splatting. The method performs robustly without heavy priors, which suggests that neural parametric regularization is effective for high-fidelity novel-view synthesis of challenging dynamic scenes.
Abstract
Reconstructing dynamic objects from monocular videos is a severely underconstrained and challenging problem, and recent work has approached it from various directions. However, owing to the ill-posed nature of this problem, no existing solution provides consistent, high-quality novel views from camera positions that differ significantly from the training views. In this work, we introduce Neural Parametric Gaussians (NPGs) to take on this challenge with a two-stage approach: first, we fit a low-rank neural deformation model, which is then used as regularization for non-rigid reconstruction in the second stage. The first stage learns the object's deformations in a way that preserves consistency in novel views. The second stage obtains high reconstruction quality by optimizing 3D Gaussians that are driven by the coarse model. To this end, we introduce a local 3D Gaussian representation, where temporally shared Gaussians are anchored in and deformed by local oriented volumes. The combined model can be rendered as a radiance field, yielding high-quality photo-realistic reconstructions of non-rigidly deforming objects. We demonstrate that NPGs achieve superior results compared to previous works, especially in challenging scenarios with few multi-view cues.
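The two ideas named in the abstract, a low-rank deformation model and temporally shared Gaussians anchored in local oriented volumes, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the factorization or how volume orientations are obtained, so the sketch assumes a linear basis with per-frame coefficients and takes the rotations as given inputs. All names (`coarse_points`, `gaussians_to_world`, `rot_z`) and all shapes are hypothetical.

```python
import numpy as np

def coarse_points(coeffs, basis):
    """Assumed low-rank model: each frame's coarse points are a linear
    combination of K shared basis point sets.
    coeffs: (T, K), basis: (K, N, 3) -> returns (T, N, 3)."""
    return np.einsum("tk,knd->tnd", coeffs, basis)

def gaussians_to_world(local_means, centers, rotations):
    """Map Gaussian means stored in per-volume local frames to world space.
    local_means: (N, G, 3), shared across all frames
    centers:     (N, 3), volume centers at one frame
    rotations:   (N, 3, 3), volume orientations at one frame
    returns:     (N, G, 3)"""
    rotated = np.einsum("nij,ngj->ngi", rotations, local_means)
    return rotated + centers[:, None, :]

def rot_z(angle):
    """Rotation matrix about the z-axis (placeholder orientation)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

rng = np.random.default_rng(0)
T, K, N, G = 5, 2, 4, 3  # frames, basis size, volumes, Gaussians per volume

# Stage 1 (assumed form): coarse point positions for every frame.
coeffs = rng.normal(size=(T, K))
basis = rng.normal(size=(K, N, 3))
points = coarse_points(coeffs, basis)  # (T, N, 3)

# Stage 2 (assumed form): the same local Gaussians are reused in every
# frame and only move because their anchoring volumes move.
local_means = rng.normal(scale=0.1, size=(N, G, 3))
world = np.stack([
    gaussians_to_world(
        local_means,
        points[t],
        np.broadcast_to(rot_z(0.3 * t), (N, 3, 3)),  # placeholder rotations
    )
    for t in range(T)
])  # (T, N, G, 3)
```

Because the Gaussians are stored once in local coordinates, every frame reuses the same parameters, and the coarse model alone determines how they move over time. In this sketch that is what lets the coarse stage act as a regularizer for the fine stage.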
