Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse Depth

Maryam Yousefi, Soodeh Bakhshandeh

TL;DR

The paper tackles dense 3D scene reconstruction from highly sparse depth by introducing a curvature-regularized variational autoencoder (CR-VAE) that uses a discrete Laplacian as the sole geometric regularizer. It demonstrates an 18.1% reduction in test loss over a standard VAE on NYU Depth V2, while maintaining stable training and zero inference overhead, challenging the assumption that more complex, multi-term geometric losses are always beneficial. The authors show that the discrete Laplacian yields stable gradients, effective noise suppression, and scale alignment for $32^3$ voxel grids, achieving faster convergence and robust performance across seeds. They also discuss limitations, potential extensions, and practical implications for robotics, AR, and safety-critical systems, with code available for reproducibility.

Abstract

When depth sensors provide only 5% of needed measurements, reconstructing complete 3D scenes becomes difficult. Autonomous vehicles and robots cannot tolerate the geometric errors that sparse reconstruction introduces. We propose curvature regularization through a discrete Laplacian operator, achieving 18.1% better reconstruction accuracy than standard variational autoencoders. Our contribution challenges an implicit assumption in geometric deep learning: that combining multiple geometric constraints improves performance. A single well-designed regularization term not only matches but exceeds the effectiveness of complex multi-term formulations. The discrete Laplacian offers stable gradients and noise suppression with just 15% training overhead and zero inference cost. Code and models are available at https://github.com/Maryousefi/GeoVAE-3D.
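To make the idea of a discrete Laplacian regularizer concrete, the following is a minimal, dependency-free sketch of one common choice: the 6-neighbor finite-difference stencil on a cubic voxel grid, squared and averaged over interior voxels. The function name `laplacian_penalty`, the stencil, the squared-mean reduction, and the skipping of boundary voxels are all assumptions made for illustration; the abstract does not specify these details, and the released code may differ.

```python
def laplacian_penalty(grid):
    """Mean squared 6-neighbor discrete Laplacian over interior voxels.

    `grid` is a cubic nested list, grid[x][y][z], of floats (for example
    predicted occupancy values). Boundary voxels are skipped; this is an
    illustrative assumption, not a detail taken from the paper.
    """
    n = len(grid)
    total, count = 0.0, 0
    for x in range(1, n - 1):
        for y in range(1, n - 1):
            for z in range(1, n - 1):
                # Sum of the six axis-aligned neighbors.
                neighbors = (
                    grid[x - 1][y][z] + grid[x + 1][y][z]
                    + grid[x][y - 1][z] + grid[x][y + 1][z]
                    + grid[x][y][z - 1] + grid[x][y][z + 1]
                )
                # Finite-difference Laplacian at (x, y, z).
                lap = neighbors - 6.0 * grid[x][y][z]
                total += lap * lap
                count += 1
    # Grids smaller than 3 voxels per side have no interior voxels.
    return total / count if count else 0.0
```

Under this definition a linear ramp (a flat, tilted surface) incurs zero penalty, while an isolated spike is penalized heavily, which matches the noise-suppression behavior described above. In a training loop the same stencil would normally be applied as a vectorized 3D convolution rather than with Python loops.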

Paper Structure

This paper contains 17 sections, 6 equations, 3 figures, and 2 tables.

Figures (3)

  • Figure 1: Our curvature-regularized VAE (CR-VAE) improves reconstruction by 18.1% over baseline. Contrary to conventional wisdom, the single curvature term outperforms multi-term geometric losses. Statistical testing across three random seeds confirms significance at $p < 0.001$.
  • Figure 2: Visual comparison of different geometric regularization strategies. Our single curvature term (blue) substantially outperforms the baseline (green) with exceptional stability. Multi-term alternatives struggle with optimization, exhibiting high variance. Error bars show $\pm 1$ standard deviation across three runs.
  • Figure 3: Training dynamics reveal efficient convergence and seamless component integration. Panel (a) shows faster convergence than baseline without optimization instabilities. Panel (b) demonstrates that our objective weights ($\beta=0.001$, $\lambda_c=0.02$) create harmony among competing terms rather than conflicts.
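
The weights quoted in the Figure 3 caption suggest a three-term objective. The sketch below assumes the usual additive form (reconstruction loss plus a $\beta$-weighted KL term plus a $\lambda_c$-weighted curvature term); that form and the name `cr_vae_objective` are assumptions for illustration, since only the two weight values appear in the text here.

```python
def cr_vae_objective(recon_loss, kl_div, curvature_penalty,
                     beta=0.001, lambda_c=0.02):
    """Weighted sum of the three loss terms.

    The additive form is assumed; only the default weights
    (beta=0.001, lambda_c=0.02) come from the Figure 3 caption.
    """
    return recon_loss + beta * kl_div + lambda_c * curvature_penalty
```

With weights this small, both regularizers act as gentle corrections on top of the reconstruction term rather than dominating it.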