
Voxel-Mesh Hybrid Representation for Real-Time View Synthesis

Chenhao Zhang, Yongyang Zhou, Lei Zhang

TL;DR

Vosh is a hybrid representation that seamlessly combines voxel and mesh components in hybrid rendering for view synthesis, achieving a commendable trade-off between rendering quality and speed and, notably, real-time performance on mobile devices.

Abstract

Neural radiance fields (NeRF) have emerged as a prominent methodology for synthesizing realistic images of novel views. Neural radiance representations based on voxels or on meshes each offer distinct advantages, excelling in either rendering quality or speed, but each is limited in the other aspect. In response, we propose a hybrid representation named Vosh, which seamlessly combines voxel and mesh components in hybrid rendering for view synthesis. Vosh is constructed by optimizing a voxel grid with neural rendering and then converting a portion of the volumetric density field into a surface mesh. As a result, its mesh component renders regions with simple geometry and textures quickly, while its voxel component preserves high-quality rendering in intricate regions. The hybrid ratio of Vosh is adjustable, letting users control the balance between rendering quality and speed according to their needs. Experimental results demonstrate that our method achieves a commendable trade-off between rendering quality and speed, and notably runs in real time on mobile devices. The interactive web demo and code are available at https://zyyzyy06.github.io/Vosh.
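
To make the idea of hybrid rendering concrete, the following is a minimal sketch of how voxel samples and a mesh hit might be combined along a single ray. It is not the authors' implementation: the function name `composite_hybrid_ray`, the tuple layouts, and the assumption that the mesh surface is fully opaque are all hypothetical choices made for illustration. It uses standard front-to-back alpha compositing for the voxel samples in front of the surface, then assigns the remaining transmittance to the surface color.

```python
import math


def composite_hybrid_ray(voxel_samples, surface_hit=None, background=(0.0, 0.0, 0.0)):
    """Composite one ray front to back (illustrative sketch, not the paper's code).

    voxel_samples: list of (t, sigma, delta, rgb) tuples sorted by depth t, where
        sigma is the density, delta the step length, and rgb a 3-tuple in [0, 1].
    surface_hit: optional (t_surface, rgb) for the first mesh intersection.
        Assumption: the mesh surface is fully opaque.
    Returns the accumulated color and the transmittance left after the voxels.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0

    for t, sigma, delta, rgb in voxel_samples:
        # Samples at or behind the opaque surface are occluded, so stop early.
        if surface_hit is not None and t >= surface_hit[0]:
            break
        # Standard volume-rendering opacity for a segment of length delta.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for c in range(3):
            color[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha

    # Whatever light is not absorbed by the voxels comes from the surface,
    # or from the background if the ray misses the mesh.
    tail = surface_hit[1] if surface_hit is not None else background
    for c in range(3):
        color[c] += transmittance * tail[c]

    return tuple(color), transmittance
```

In this sketch, a ray that crosses no voxel samples costs only a surface lookup, while a ray through a region kept as voxels pays for per-sample compositing. That mirrors, in simplified form, the quality-versus-speed balance described in the abstract.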

Paper Structure (19 sections, 10 equations, 9 figures, 5 tables)

This paper contains 19 sections, 10 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Left: The hybrid representation Vosh combines a voxel component (highlighted with pseudo colors) and a mesh component (indicated by gray triangles) in hybrid rendering. Right: The proposed method enables real-time view synthesis on various mobile devices, e.g., laptops and mobile phones.
  • Figure 2: An overview of the proposed methodology. The training phase starts with grid training to obtain the initial voxels. Then, a portion of the initial volumetric density field is meshed. Subsequently, the combination of voxels and mesh is optimized through hybrid rendering and voxel pruning to obtain the final hybrid representation Vosh. The inference phase performs real-time hybrid rendering with Vosh, even on mobile phones.
  • Figure 3: Visualization of the conversion loss $E_{convert}$ based on rendering quality. (a) and (b) are images rendered from the initial voxels and from the mesh with differentiable surface refinement, respectively. (c) is the gray-scale image of the conversion loss $E_{convert}$ calculated from the rendering quality of both representations. (d) shows the result of filtering out the top $10\%$ of $E_{convert}$ in the space.
  • Figure 4: The proposed hybrid rendering (c) integrates volume rendering (a) and surface rendering (b) for Vosh.
  • Figure 5: Rendering results and zoomed-in images of outdoor scenes obtained by our method and by several SOTA methods based on voxel or mesh representations.
  • ...and 4 more figures