BakedSDF: Meshing Neural SDFs for Real-Time View Synthesis

Lior Yariv, Peter Hedman, Christian Reiser, Dor Verbin, Pratul P. Srinivasan, Richard Szeliski, Jonathan T. Barron, Ben Mildenhall

TL;DR

The paper addresses real-time novel view synthesis for large, unbounded scenes, where NeRF-style volumetric methods are accurate but too slow for real-time rendering on consumer hardware. It introduces BakedSDF, a three-stage pipeline that optimizes a hybrid neural volume-surface representation, bakes it into a high-quality triangle mesh, and equips the mesh with a lightweight, view-dependent appearance model based on spherical Gaussians (SGs), which enables real-time, in-browser rendering. Key contributions include high-quality neural surface reconstruction for unbounded scenes, a mesh baking and region-growing workflow, and an efficient SG-based appearance model. The resulting meshes also support applications such as appearance editing and physical simulation. The authors report that the approach outperforms previous real-time scene representations in accuracy, speed, and power consumption on commodity hardware.

Abstract

We present a method for reconstructing high-quality meshes of large unbounded real-world scenes suitable for photorealistic novel view synthesis. We first optimize a hybrid neural volume-surface scene representation designed to have well-behaved level sets that correspond to surfaces in the scene. We then bake this representation into a high-quality triangle mesh, which we equip with a simple and fast view-dependent appearance model based on spherical Gaussians. Finally, we optimize this baked representation to best reproduce the captured viewpoints, resulting in a model that can leverage accelerated polygon rasterization pipelines for real-time view synthesis on commodity hardware. Our approach outperforms previous scene representations for real-time rendering in terms of accuracy, speed, and power consumption, and produces high quality meshes that enable applications such as appearance editing and physical simulation.
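
The view-dependent appearance model mentioned in the abstract can be illustrated with a small sketch. It assumes the standard spherical Gaussian lobe form, a color scaled by exp(sharpness * (mean · view_dir - 1)), added on top of a diffuse color. The function name, parameter names, lobe count, and numeric values below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)


def sg_color(diffuse, lobes, view_dir):
    """Diffuse RGB plus a sum of spherical Gaussian lobes.

    Each lobe is (rgb, mean_direction, sharpness). This is an
    illustrative sketch of the general technique, not the authors' code.
    """
    d = normalize(view_dir)
    color = list(diffuse)
    for lobe_rgb, mean, sharpness in lobes:
        mu = normalize(mean)
        cos_angle = sum(a * b for a, b in zip(mu, d))
        # The lobe peaks at 1 when view_dir aligns with the lobe mean
        # and decays faster for larger sharpness values.
        weight = math.exp(sharpness * (cos_angle - 1.0))
        for k in range(3):
            color[k] += lobe_rgb[k] * weight
    return tuple(color)


# Hypothetical values: one sharp white highlight pointing along +z.
diffuse = (0.2, 0.1, 0.05)
lobes = [((0.8, 0.8, 0.8), (0.0, 0.0, 1.0), 20.0)]

on_axis = sg_color(diffuse, lobes, (0.0, 0.0, 1.0))   # full highlight
off_axis = sg_color(diffuse, lobes, (1.0, 0.0, 0.0))  # almost diffuse only
```

Viewed along the lobe mean, the highlight is added at full strength; viewed at 90 degrees, its contribution is negligible and the result is essentially the diffuse color. Because each evaluation is a handful of dot products and exponentials, a model of this kind is cheap to evaluate per pixel.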

Paper Structure

This paper contains 24 sections, 7 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: An illustration of the three stages of our method. We first reconstruct the scene using a surface-based volumetric representation, then bake it into a high-quality mesh, and finally optimize a view-dependent appearance model based on spherical Gaussians.
  • Figure 2: Our method produces an accurate mesh and decomposes appearance into diffuse and specular color.
  • Figure 3: Test-set renderings (with insets) for our model and the two state-of-the-art real-time baselines we evaluate against, using scenes from the mip-NeRF 360 dataset. Deep Blending [Hedman et al. 2018] produces posterized renderings when the proxy geometry used as input is incorrect (such as in the background of the bicycle scene), and renderings from MobileNeRF [Chen et al. 2022] tend to exhibit aliasing artifacts or oversmoothing.
  • Figure 4: A comparison of the meshes produced by our technique with those of baselines that yield meshes. Our meshes are of higher quality than those of COLMAP, MobileNeRF, and Mip-NeRF 360. COLMAP's mesh contains noise, floaters, and irregular object boundaries; MobileNeRF's mesh is a "polygon soup" that may not accurately represent scene geometry; and iso-surfaces from Mip-NeRF 360's density field tend to be noisy and represent reflections with inaccurate geometry.
  • Figure 5: Our framework is based on the neural SDF representation, which struggles to represent semi-transparent objects or thin structures. These limitations can in turn degrade the quality of our renderings.