HVOFusion: Incremental Mesh Reconstruction Using Hybrid Voxel Octree

Shaofan Liu, Junbo Chen, Jianke Zhu

TL;DR

This work tackles online, large-scale scene reconstruction by balancing speed, memory, and surface quality. It introduces HVOFusion, which fuses an octree backbone with hierarchical voxels in leaf nodes to store explicit mesh faces alongside an implicit surface, enabling incremental partial meshes $\mathcal{M}_t$ that are fused into a global mesh $\mathcal{M}$. The method combines a point-based refinement (Chamfer distance to the input point cloud) with a shading-based refinement (differentiable shading using four-component spherical harmonics) to jointly optimize geometry and vertex colors. Experiments on Replica, ScanNet++, and LiDAR datasets show improved geometric detail, realistic coloring, and efficient runtime compared to multiple baselines, demonstrating practical benefits for real-time robotic perception and mapping.
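As a rough illustration of the point-based refinement term, the sketch below computes a symmetric Chamfer distance between mesh vertices and an input point cloud. It is a minimal, brute-force version under stated assumptions: squared Euclidean distances, mean reduction in both directions, and hypothetical names (`chamfer_distance`, `mesh_vertices`, `input_points`). The paper's exact formulation, weighting, and sampling of mesh points may differ.

```python
def nearest_sq_dist(p, cloud):
    """Squared Euclidean distance from point p to its nearest neighbour in cloud."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)


def chamfer_distance(src, dst):
    """Symmetric Chamfer distance (assumed form): mean nearest-neighbour
    squared distance from src to dst plus the same from dst to src."""
    if not src or not dst:
        raise ValueError("both point sets must be non-empty")
    forward = sum(nearest_sq_dist(p, dst) for p in src) / len(src)
    backward = sum(nearest_sq_dist(q, src) for q in dst) / len(dst)
    return forward + backward


# Hypothetical toy data: two mesh vertices and three scanned points.
mesh_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
input_points = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.0), (1.0, 0.2, 0.0)]
print(chamfer_distance(mesh_vertices, input_points))
```

The brute-force search is quadratic in the number of points; a practical system would more likely use a spatial index or a GPU nearest-neighbour kernel and differentiate the loss with respect to the vertex positions.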

Abstract

Incremental scene reconstruction is essential to robot navigation. Most conventional methods use either a TSDF (truncated signed distance function) volume or a neural network to represent the surface implicitly. Because of the voxel representation or the time-consuming sampling involved, they have difficulty balancing speed, memory storage, and surface quality. In this paper, we propose a novel hybrid voxel-octree approach that fuses an octree with voxel structures so that we can take advantage of both an implicit surface and an explicit triangular mesh representation. This sparse structure preserves triangular faces in the leaf nodes and produces partial meshes sequentially for incremental reconstruction. The storage scheme allows us to optimize the mesh directly in explicit 3D space to achieve higher surface quality. We iteratively deform the mesh towards the target and recover vertex colors by optimizing a shading model. Experimental results on several datasets show that our approach reconstructs a scene quickly and accurately, with realistic colors.
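To make the idea of recovering vertex colors through a shading model more concrete, here is a small sketch of per-vertex shading with four spherical-harmonic (SH) lighting coefficients, that is, the constant band plus the three linear terms. The Lambertian form (color = albedo × SH irradiance), the coefficient ordering, and the names `sh_basis` and `shade_vertex` are assumptions made for illustration; the paper's actual shading model is not specified in this summary.

```python
import math

# Constants of the real SH basis for band 0 and band 1.
SH_C0 = 0.5 * math.sqrt(1.0 / math.pi)  # Y_0^0
SH_C1 = 0.5 * math.sqrt(3.0 / math.pi)  # scale of Y_1^{-1}, Y_1^0, Y_1^1


def sh_basis(normal):
    """Evaluate the four lowest real SH basis functions at a surface normal."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    nx, ny, nz = nx / length, ny / length, nz / length
    # Ordering (assumed): constant, y, z, x.
    return (SH_C0, SH_C1 * ny, SH_C1 * nz, SH_C1 * nx)


def shade_vertex(albedo, normal, light_coeffs):
    """Shaded RGB color: per-vertex albedo scaled by the SH lighting term."""
    irradiance = sum(l * b for l, b in zip(light_coeffs, sh_basis(normal)))
    return tuple(a * irradiance for a in albedo)


# Hypothetical example: ambient light plus a component along +z.
light = (1.0 / SH_C0, 0.0, 0.5 / SH_C1, 0.0)
print(shade_vertex((0.8, 0.4, 0.2), (0.0, 0.0, 1.0), light))
```

In an optimization setting, the albedo, lighting coefficients, and vertex positions (through the normals) would be the free variables, fitted so that the shaded colors match the observed images.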

Paper Structure

This paper contains 26 sections, 9 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: We reconstruct the scene in an incremental manner. The partial mesh $\mathcal{M}_t$ output at time $t$ is constructed by the hybrid voxel-octree and deformed in the refinement branch. We merge all the partial meshes to get the final mesh $\mathcal{M}$.
  • Figure 2: The architecture of our proposed incremental mesh reconstruction approach. In the incremental reconstruction pipeline, each frame of the point cloud is downsampled and inserted into the hybrid voxel-octree, resulting in a sparse voxel structure composed of leaf nodes. The hybrid voxel-octree is then used to extract the partial mesh ${\mathcal{M}_t'}$. The topology, colors, and vertex positions of ${\mathcal{M}_t'}$ are optimized through the point-based and shading-based refinement branches. The refined partial mesh ${\mathcal{M}_t}$ is finally fused into the global mesh.
  • Figure 3: Example of the structure and node properties of the hybrid voxel-octree. Each node is indicated by a binary Morton code representing its position.
  • Figure 4: Example of voxel meshing. $c_n$ is a voxel corner, which depends on the voxel level. $p_n$ and $q_n$ are vertices of the triangular faces within the voxel. (a) The positions of $p_n$ and $q_n$ are consistent when adjacent voxels are at the same level. (b) To ensure consistency, $p_n$ is modified to $p'_n$ when adjacent voxels are at different levels.
  • Figure 5: Reconstruction Result for Replica Dataset. The right half shows the result after vertex coloring. Our method recovers more accurate and smoother geometric shapes than implicit methods, particularly in flat and detailed regions.
  • ...and 5 more figures
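
The Figure 3 caption notes that each octree node is identified by a binary Morton code representing its position. The sketch below shows one common way to build such a code by interleaving the bits of integer grid coordinates, so that every group of three bits selects one of eight child octants. The bit order (x in the lowest bit of each group), the fixed `depth` parameter, and the function names are assumptions for illustration; the paper's own encoding may differ.

```python
def morton_encode(x, y, z, depth):
    """Interleave the low `depth` bits of x, y, z into a single Morton code."""
    code = 0
    for level in range(depth):
        code |= ((x >> level) & 1) << (3 * level)
        code |= ((y >> level) & 1) << (3 * level + 1)
        code |= ((z >> level) & 1) << (3 * level + 2)
    return code


def morton_decode(code, depth):
    """Recover the integer grid coordinates from a Morton code."""
    x = y = z = 0
    for level in range(depth):
        x |= ((code >> (3 * level)) & 1) << level
        y |= ((code >> (3 * level + 1)) & 1) << level
        z |= ((code >> (3 * level + 2)) & 1) << level
    return x, y, z


def child_path(code, depth):
    """Octant index (0-7) taken at each level, ordered from root to leaf."""
    return [(code >> (3 * level)) & 0b111 for level in reversed(range(depth))]


code = morton_encode(3, 1, 2, depth=2)
print(bin(code), child_path(code, depth=2), morton_decode(code, depth=2))
```

Because the most significant three-bit group corresponds to the root's children, nodes that share a code prefix share an ancestor, which is what makes Morton codes convenient as keys for sparse octree storage.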