Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes

Stefano Esposito, Anpei Chen, Christian Reiser, Samuel Rota Bulò, Lorenzo Porzi, Katja Schwarz, Christian Richardt, Michael Zollhöfer, Peter Kontschieder, Andreas Geiger

TL;DR

Volumetric Surfaces introduces k-SDF, a multi-layered, semi-transparent shell representation that bounds per-ray sampling to a small, fixed number of points and renders in a sorting-free rasterization pipeline. By learning adaptive shell spacing and employing view-dependent transparency, the method captures fuzzy geometries (e.g., hair) and bakes the implicit shells into lightweight meshes with neural textures, enabling real-time rendering on mobile hardware. The two-stage pipeline (implicit k-SDF optimization, followed by mesh baking and SH texture training) yields results that outperform purely surface-based methods and approach volume-based baselines in quality, with significantly faster runtimes on general-purpose devices. Ablations show the importance of view-dependent transparency, curvature regularization, and grazing-angle attenuation; the stated limitations are artifacts at grazing angles and difficulty with sparse or fully solid scenes. Overall, Volumetric Surfaces offers a practical, real-time solution for rendering fuzzy objects on consumer hardware, with a favorable quality-speed trade-off and a clear path toward end-to-end asset generation in the future.

Abstract

High-quality view synthesis relies on volume rendering, splatting, or surface rendering. While surface rendering is typically the fastest, it struggles to accurately model fuzzy geometry like hair. In turn, alpha-blending techniques excel at representing fuzzy materials but require an unbounded number of samples per ray (P1). Further overheads are induced by empty space skipping in volume rendering (P2) and sorting input primitives in splatting (P3). We present a novel representation for real-time view synthesis where the (P1) number of sampling locations is small and bounded, (P2) sampling locations are efficiently found via rasterization, and (P3) rendering is sorting-free. We achieve this by representing objects as semi-transparent multi-layer meshes rendered in a fixed order. First, we model surface layers as signed distance function (SDF) shells with optimal spacing learned during training. Then, we bake them as meshes and fit UV textures. Unlike single-surface methods, our multi-layer representation effectively models fuzzy objects. In contrast to volume and splatting-based methods, our approach enables real-time rendering on low-power laptops and smartphones.
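
To make the "small and bounded" sample count concrete, here is a minimal sketch of blending k semi-transparent layer samples along one ray in a fixed order. It is an illustration under assumptions, not the authors' renderer: the function name `composite_layers`, its inputs, and the front-to-back ordering are hypothetical, and the standard "over" alpha-blending rule is assumed.

```python
import numpy as np


def composite_layers(colors, alphas, background=(1.0, 1.0, 1.0)):
    """Blend k layer samples for one ray, nearest layer first.

    colors: (k, 3) RGB at the ray's first hit with each layer (assumed input).
    alphas: (k,) opacity in [0, 1] at each hit (assumed input).
    The loop runs exactly k times, so per-ray cost is bounded by the layer
    count, and no per-ray sorting is done because the order is fixed.
    """
    colors = np.asarray(colors, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    out = np.zeros(3)
    transmittance = 1.0  # fraction of light not yet blocked by nearer layers
    for color, alpha in zip(colors, alphas):
        out += transmittance * alpha * color
        transmittance *= 1.0 - alpha
    # Whatever light passes through all layers shows the background.
    return out + transmittance * np.asarray(background, dtype=float)


if __name__ == "__main__":
    # Two half-transparent layers (red in front of green) over black.
    print(composite_layers([[1, 0, 0], [0, 1, 0]], [0.5, 0.5], background=(0, 0, 0)))
```

Blending the same samples back to front with the "over" operator gives the same color, so the choice of direction here is only a convenience of the sketch.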

Paper Structure (16 sections, 5 equations, 13 figures, 8 tables)

Figures (13)

  • Figure 1: Sampling strategies: ($a$) volumetric rendering's dense sampling; ($b$) single sampling point, as in surface rendering; ($c$) our method, only sampling the first intersection with each surface.
  • Figure 2: ($a$) High-level architecture of our $k$-SDF network, predicting $k$ distance values as described in the $k$-SDF section. For simplicity of visualization, all offsets are positive. We highlight trainable components. For additional architectural details, refer to the supplementary material. ($b$) 1D example visualization of the output $d_{1:k}$ when evaluating the network at a sample point $\mathbf{x}$ along a ray. Signed distances are shown as solid lines, while $\beta$-controlled integration weights are represented as dotted lines.
  • Figure 3: 1D example visualization of 3-SDF along a ray. Signed distances are shown as solid lines, while $\beta$-controlled integration weights are represented as dotted lines. ($a$) Initialization of the support surfaces as positive and negative constant displacements $\Delta o$ from the main SDF. ($b$) Densities are peaked by the end of training. The two support surfaces are displaced from the main SDF $d$ by trained offsets $(o_1, o_2)$: $d_1 = d - o_1$, $d_2 = d + o_2$.
  • Figure 4: Bilinear interpolation in our mixed-resolution textures. Instead of querying the blue 2D point directly, we predict the values at its surrounding texel centers (red points) and bilinearly interpolate them. This anchors the neural texture to a predefined target resolution ($W$, $H$).
  • Figure 5: Frame rate vs. image quality comparison (smartphone results $\diamond$ from the performance table). The radius of each circle represents the memory footprint as stored on disk. The vertical dashed line marks the required frame rate for real-time rendering (30 FPS).
  • ...and 8 more figures
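
The offset relation in the Figure 3 caption ($d_1 = d - o_1$, $d_2 = d + o_2$) can be tried out on a 1D ray. The sketch below is only an illustration: the helper names `three_sdf` and `first_zero_crossing`, the linear toy SDF, and the offset values are all assumptions, as is the convention that the SDF is positive outside the object.

```python
import numpy as np


def three_sdf(d, o1, o2):
    """Return (d1, d, d2) following the Figure 3 caption: d1 = d - o1, d2 = d + o2."""
    d = np.asarray(d, dtype=float)
    return d - o1, d, d + o2


def first_zero_crossing(t, values):
    """Ray parameter of the first sign change, by linear interpolation (or None)."""
    t = np.asarray(t, dtype=float)
    values = np.asarray(values, dtype=float)
    idx = np.nonzero(values[:-1] * values[1:] <= 0)[0]
    if idx.size == 0:
        return None
    i = idx[0]
    v0, v1 = values[i], values[i + 1]
    if v0 == v1:  # both samples are exactly zero
        return float(t[i])
    return float(t[i] + (t[i + 1] - t[i]) * v0 / (v0 - v1))


if __name__ == "__main__":
    t = np.linspace(0.0, 2.0, 201)   # sample positions along the ray
    d = 1.0 - t                      # toy main SDF: surface at t = 1
    d1, d_main, d2 = three_sdf(d, o1=0.2, o2=0.1)  # hypothetical offsets
    for name, values in (("d1", d1), ("d", d_main), ("d2", d2)):
        print(name, first_zero_crossing(t, values))
```

Under these assumptions, a ray entering from outside crosses the $d_1$ shell before the main surface and the $d_2$ shell after it, so positive offsets yield one intersection per shell in a fixed order along the ray.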
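
The texel-center lookup described in the Figure 4 caption can be sketched as follows. This is a hedged illustration rather than the paper's implementation: `texel_fn` is a hypothetical stand-in for the neural texture, and the UV convention (coordinates in [0, 1], texel centers at $((i + 0.5)/W, (j + 0.5)/H)$, clamping at the border) is an assumption.

```python
import math


def sample_anchored(texel_fn, u, v, W, H):
    """Evaluate texel_fn only at the four texel centers around (u, v) and blend.

    texel_fn(u, v) is a hypothetical callable standing in for the neural
    texture; it is never queried at the continuous point itself.
    """
    # Continuous position measured in texel units, relative to texel centers.
    x = u * W - 0.5
    y = v * H - 0.5
    i0, j0 = math.floor(x), math.floor(y)
    fx, fy = x - i0, y - j0  # bilinear weights in [0, 1)

    def center_value(i, j):
        # Clamp to the grid so border queries reuse the nearest texel.
        i = min(max(i, 0), W - 1)
        j = min(max(j, 0), H - 1)
        return texel_fn((i + 0.5) / W, (j + 0.5) / H)

    top = (1 - fx) * center_value(i0, j0) + fx * center_value(i0 + 1, j0)
    bottom = (1 - fx) * center_value(i0, j0 + 1) + fx * center_value(i0 + 1, j0 + 1)
    return (1 - fy) * top + fy * bottom


if __name__ == "__main__":
    # A linear toy "texture": bilinear blending reproduces it in the interior.
    print(sample_anchored(lambda u, v: 2 * u + 3 * v, 0.4, 0.6, W=8, H=8))
```

Because the callable is only ever evaluated on the $W \times H$ grid of centers, its outputs could be precomputed into an ordinary texture of that resolution and filtered the same way.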