Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse Depth
Maryam Yousefi, Soodeh Bakhshandeh
TL;DR
The paper tackles dense 3D scene reconstruction from highly sparse depth by introducing a curvature-regularized variational autoencoder (CR-VAE) that uses a discrete Laplacian as the sole geometric regularizer. It demonstrates an 18.1% reduction in test loss over a standard VAE on NYU Depth V2, while maintaining stable training and zero inference overhead, challenging the assumption that more complex, multi-term geometric losses are always beneficial. The authors show that the discrete Laplacian yields stable gradients, effective noise suppression, and scale alignment for 32^3 voxel grids, achieving faster convergence and robust performance across seeds. They also discuss limitations, potential extensions, and practical implications for robotics, AR, and safety-critical systems, with code available for reproducibility.
Abstract
When depth sensors provide only 5% of the needed measurements, reconstructing complete 3D scenes becomes difficult, and autonomous vehicles and robots cannot tolerate the geometric errors that sparse reconstruction introduces. We propose curvature regularization through a discrete Laplacian operator, which achieves an 18.1% lower test loss than a standard variational autoencoder. Our contribution challenges an implicit assumption in geometric deep learning: that combining multiple geometric constraints improves performance. In our experiments, a single well-designed regularization term not only matches but exceeds the effectiveness of complex multi-term formulations. The discrete Laplacian offers stable gradients and noise suppression with just 15% training overhead and zero inference cost. Code and models are available at https://github.com/Maryousefi/GeoVAE-3D.
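To make the idea of a discrete-Laplacian curvature penalty concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a 6-neighbor finite-difference stencil, edge-replicated boundaries, and a mean-squared penalty; the function names `discrete_laplacian_3d` and `curvature_penalty` are hypothetical, and the actual formulation in the released code may differ.

```python
import numpy as np


def discrete_laplacian_3d(v: np.ndarray) -> np.ndarray:
    """6-neighbor discrete Laplacian of a 3D grid.

    Assumption: boundaries are handled by replicating edge voxels.
    """
    p = np.pad(v, 1, mode="edge")
    return (
        p[2:, 1:-1, 1:-1] + p[:-2, 1:-1, 1:-1]    # neighbors along axis 0
        + p[1:-1, 2:, 1:-1] + p[1:-1, :-2, 1:-1]  # neighbors along axis 1
        + p[1:-1, 1:-1, 2:] + p[1:-1, 1:-1, :-2]  # neighbors along axis 2
        - 6.0 * v
    )


def curvature_penalty(v: np.ndarray) -> float:
    """Mean squared Laplacian over the grid (assumed form of the penalty)."""
    return float(np.mean(discrete_laplacian_3d(v) ** 2))


# Illustrative comparison on a 32^3 grid: a smooth ramp versus a noisy copy.
rng = np.random.default_rng(0)
ramp = np.broadcast_to(np.arange(32.0)[:, None, None], (32, 32, 32)).copy()
noisy = ramp + 0.1 * rng.standard_normal(ramp.shape)

smooth_penalty = curvature_penalty(ramp)
noisy_penalty = curvature_penalty(noisy)
```

Under these assumptions the noisy grid receives a larger penalty than the smooth ramp, which is the sense in which such a term suppresses high-frequency noise. In training, a term of this kind would be added to the VAE objective with a weighting coefficient, and it plays no role at inference time.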
