Binary Opacity Grids: Capturing Fine Geometric Detail for Mesh-Based View Synthesis
Christian Reiser, Stephan Garbin, Pratul P. Srinivasan, Dor Verbin, Richard Szeliski, Ben Mildenhall, Jonathan T. Barron, Peter Hedman, Andreas Geiger
TL;DR
This work targets a known weakness of mesh-based view synthesis: reproducing thin geometric detail while keeping rendering real-time. It introduces Binary Opacity Grids, a high-resolution discrete opacity grid predicted by an MLP and trained with supersampling (multiple rays per pixel) for anti-aliasing and a binary entropy loss, so that opacities converge to hard surfaces. After training, a fusion-based meshing pipeline converts the opacity grid into a triangle mesh, which is then simplified and paired with a lightweight view-dependent appearance model built from triplanes and a low-resolution voxel grid. At test time, temporal anti-aliasing (TAA) is used, and the resulting meshes render in real time on mobile devices. Experiments show improved reconstruction of thin structures and quality competitive with strong baselines such as BakedSDF, with faster rendering. Overall, the method narrows the gap between surface-based and volume-based view synthesis and delivers compact meshes suitable for mobile applications.
Abstract
While surface-based view synthesis algorithms are appealing due to their low computational requirements, they often struggle to reproduce thin structures. In contrast, more expensive methods that model the scene's geometry as a volumetric density field (e.g. NeRF) excel at reconstructing fine geometric detail. However, density fields often represent geometry in a "fuzzy" manner, which hinders exact localization of the surface. In this work, we modify density fields to encourage them to converge towards surfaces, without compromising their ability to reconstruct thin structures. First, we employ a discrete opacity grid representation instead of a continuous density field, which allows opacity values to discontinuously transition from zero to one at the surface. Second, we anti-alias by casting multiple rays per pixel, which allows occlusion boundaries and subpixel structures to be modelled without using semi-transparent voxels. Third, we minimize the binary entropy of the opacity values, which facilitates the extraction of surface geometry by encouraging opacity values to binarize towards the end of training. Lastly, we develop a fusion-based meshing strategy followed by mesh simplification and appearance model fitting. The compact meshes produced by our model can be rendered in real-time on mobile devices and achieve significantly higher view synthesis quality compared to existing mesh-based approaches.
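The second and third ingredients of the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`binary_entropy`, `entropy_loss`, `composite`, `render_pixel`), the use of scalar colors, the clamping constant `eps`, and the choice to average the entropy over samples are all assumptions made for illustration. The sketch shows that averaging several rays per pixel can produce an intermediate pixel value at an occlusion boundary even when every opacity is exactly 0 or 1, and that such binary opacities have near-zero binary entropy.

```python
import math


def binary_entropy(alpha, eps=1e-7):
    """Binary entropy H(a) = -a*log(a) - (1-a)*log(1-a), in nats.

    The opacity is clamped to [eps, 1 - eps] to avoid log(0);
    eps is an illustrative choice, not a value from the paper.
    """
    a = min(max(alpha, eps), 1.0 - eps)
    return -(a * math.log(a) + (1.0 - a) * math.log(1.0 - a))


def entropy_loss(alphas):
    """Mean binary entropy over a list of opacity values (assumed reduction)."""
    return sum(binary_entropy(a) for a in alphas) / len(alphas)


def composite(alphas, colors):
    """Front-to-back alpha compositing of scalar colors along one ray."""
    transmittance = 1.0
    out = 0.0
    for a, c in zip(alphas, colors):
        out += transmittance * a * c
        transmittance *= 1.0 - a
    return out


def render_pixel(subpixel_rays):
    """Average the composited colors of several rays cast through one pixel."""
    return sum(composite(a, c) for a, c in subpixel_rays) / len(subpixel_rays)


# Hypothetical pixel half covered by a thin white structure (color 1.0)
# in front of a black, fully opaque background (color 0.0).
hit = ([1.0, 1.0], [1.0, 0.0])   # ray hits the structure first
miss = ([0.0, 1.0], [1.0, 0.0])  # ray passes through an empty voxel
pixel = render_pixel([hit, hit, miss, miss])
```

Here `pixel` blends foreground and background although no voxel is semi-transparent, and `entropy_loss([0.0, 1.0])` is close to zero, whereas a "fuzzy" opacity of 0.5 attains the maximum entropy of log 2.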
