Binary Opacity Grids: Capturing Fine Geometric Detail for Mesh-Based View Synthesis

Christian Reiser, Stephan Garbin, Pratul P. Srinivasan, Dor Verbin, Richard Szeliski, Ben Mildenhall, Jonathan T. Barron, Peter Hedman, Andreas Geiger

TL;DR

This work addresses the challenge of reproducing thin geometric detail in view synthesis while maintaining real-time, mesh-based rendering. It introduces Binary Opacity Grids: a high-resolution discrete opacity grid predicted by an MLP and trained with supersampling-based anti-aliasing (multiple rays per pixel) and a binary entropy loss, so that it converges to hard surfaces. After training, a fusion-based meshing pipeline converts the opacity grid into a clean triangle mesh, followed by simplification and a lightweight view-dependent appearance model built with triplanes and a low-resolution voxel grid. Test-time anti-aliasing with TAA enables real-time rendering on mobile devices, and experiments show improved reconstruction of thin structures and competitive quality with strong baselines like BakedSDF while achieving faster rendering. Overall, the method narrows the gap between surface-based and volume-based view synthesis and delivers compact, fast meshes suitable for mobile applications.

Abstract

While surface-based view synthesis algorithms are appealing due to their low computational requirements, they often struggle to reproduce thin structures. In contrast, more expensive methods that model the scene's geometry as a volumetric density field (e.g. NeRF) excel at reconstructing fine geometric detail. However, density fields often represent geometry in a "fuzzy" manner, which hinders exact localization of the surface. In this work, we modify density fields to encourage them to converge towards surfaces, without compromising their ability to reconstruct thin structures. First, we employ a discrete opacity grid representation instead of a continuous density field, which allows opacity values to discontinuously transition from zero to one at the surface. Second, we anti-alias by casting multiple rays per pixel, which allows occlusion boundaries and subpixel structures to be modelled without using semi-transparent voxels. Third, we minimize the binary entropy of the opacity values, which facilitates the extraction of surface geometry by encouraging opacity values to binarize towards the end of training. Lastly, we develop a fusion-based meshing strategy followed by mesh simplification and appearance model fitting. The compact meshes produced by our model can be rendered in real-time on mobile devices and achieve significantly higher view synthesis quality compared to existing mesh-based approaches.
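
The three training ingredients named in the abstract (per-sample opacities, multiple rays per pixel, and a binary entropy penalty) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the scalar (grayscale) colors, the uniform averaging of subpixel rays, and the `eps` clamp are all assumptions made for illustration, and the excerpt does not specify the loss weighting or the sampling pattern.

```python
import math

def compositing_weights(alphas):
    """Front-to-back compositing weights w_i = alpha_i * prod_{j<i} (1 - alpha_j)."""
    weights, transmittance = [], 1.0
    for a in alphas:
        weights.append(transmittance * a)
        transmittance *= 1.0 - a
    return weights

def binary_entropy(alpha, eps=1e-7):
    """H(a) = -a log a - (1 - a) log(1 - a); eps clamp (assumed) avoids log(0)."""
    a = min(max(alpha, eps), 1.0 - eps)
    return -(a * math.log(a) + (1.0 - a) * math.log(1.0 - a))

def mean_binary_entropy(alphas):
    """Average entropy over samples; small when opacities are close to 0 or 1."""
    return sum(binary_entropy(a) for a in alphas) / len(alphas)

def render_ray(alphas, colors, background=0.0):
    """Composite scalar colors along one ray over a constant background."""
    weights = compositing_weights(alphas)
    color = sum(w * c for w, c in zip(weights, colors))
    return color + (1.0 - sum(weights)) * background

def render_pixel(subpixel_rays, background=0.0):
    """Supersampling: average several rays (alphas, colors) cast through one pixel."""
    values = [render_ray(a, c, background) for a, c in subpixel_rays]
    return sum(values) / len(values)

# A thin structure covering half of a pixel: two rays hit a fully opaque
# sample, two rays pass through empty space.
rays = [([1.0], [1.0]), ([1.0], [1.0]), ([0.0], [1.0]), ([0.0], [1.0])]
pixel_value = render_pixel(rays)
entropy_binary = mean_binary_entropy([0.0, 1.0])
entropy_fuzzy = mean_binary_entropy([0.5, 0.5])
```

In this toy case the pixel takes an intermediate value even though every opacity is exactly 0 or 1, which is the point the abstract makes about modelling subpixel structures without semi-transparent voxels. The entropy term is near zero for the binary opacities and reaches its maximum, log 2, for opacities of 0.5.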

Paper Structure (25 sections, 6 equations, 6 figures, 8 tables)


Figures (6)

  • Figure 1: We visualize the volume rendering weights for rays corresponding to a row of pixels (left, in pink). Density fields such as Zip-NeRF tend to represent hard surfaces as semi-transparent volumes despite their use of surface-promoting regularizers. In contrast, our opacity grid converges to a hard surface. Note that, since each pixel column visualizes the volume rendering weights of a single ray, gaps in this visualization do not indicate the presence of holes in the underlying representation.
  • Figure 2: Comparison between different meshing strategies. The bottom left image shows a depth map rendered from a mesh that was obtained by applying the meshing strategy from BakedSDF to our representation. Geometry is instantiated at all visible voxels with an opacity value of $1$ that are sampled by the proposal MLP in any training view. This leads to numerous floating artifacts, as infrequently sampled voxels in free space are severely underconstrained by the training loss. The bottom right shows that these underconstrained voxels can be effectively filtered by running volumetric fusion on depth maps rendered from our model. This filtering step also fully preserves thin structures, as can be seen in the top image.
  • Figure 3: Comparison between different representations for mesh appearance. Replacing vertex attributes with a grid representation leads to sharper textures. There is almost no difference between "voxel grid" and the cheaper alternative "triplane + voxel".
  • Figure 4: Our method narrows the quality gap between surface-based and volume-based methods when it comes to the reconstruction of thin structures.
  • Figure 5: Our method retains more geometric detail than BakedSDF. Bottom rows are visualizations of depth maps, for which we do not have ground-truth.
  • ...and 1 more figure