HVOFusion: Incremental Mesh Reconstruction Using Hybrid Voxel Octree
Shaofan Liu, Junbo Chen, Jianke Zhu
TL;DR
This work tackles online, large-scale scene reconstruction by balancing speed, memory, and surface quality. It introduces HVOFusion, which fuses an octree backbone with hierarchical voxels in leaf nodes to store explicit mesh faces alongside an implicit surface, enabling incremental partial meshes $\mathcal{M}_t$ that are fused into a global mesh $\mathcal{M}$. The method combines a point-based refinement (Chamfer distance to the input point cloud) with a shading-based refinement (differentiable shading using four-component spherical harmonics) to jointly optimize geometry and vertex colors. Experiments on Replica, ScanNet++, and LiDAR datasets show improved geometric detail, realistic coloring, and efficient runtime compared to multiple baselines, demonstrating practical benefits for real-time robotic perception and mapping.
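As a rough illustration of the point-based refinement term, the sketch below computes a symmetric Chamfer distance between sampled mesh points and an input point cloud. The function name `chamfer_distance`, the use of squared Euclidean distances, and the per-set averaging are assumptions made for illustration; the exact formulation used in HVOFusion may differ.

```python
import math

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty lists of 3D points.

    Assumption: squared Euclidean distances, averaged per set and summed
    over both directions. This is a brute-force O(len(a) * len(b)) sketch.
    """
    def one_sided(src, dst):
        # For every source point, take the squared distance to its nearest
        # neighbour in the destination set, then average over the source set.
        return sum(min(math.dist(p, q) ** 2 for q in dst) for p in src) / len(src)

    return one_sided(a, b) + one_sided(b, a)
```

In an optimization loop, a term like this would be minimized with respect to the mesh vertex positions. A practical implementation would replace the brute-force nearest-neighbour search with a spatial index or a batched GPU kernel.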
Abstract
Incremental scene reconstruction is essential to robot navigation. Conventional methods typically use either a TSDF (truncated signed distance function) volume or a neural network to represent the surface implicitly. Because they rely on a voxel representation or on time-consuming sampling, they have difficulty balancing speed, memory usage, and surface quality. In this paper, we propose a novel hybrid voxel-octree approach that fuses an octree with voxel structures so that we can take advantage of both implicit surface and explicit triangular mesh representations. This sparse structure stores triangular faces in the leaf nodes and produces partial meshes sequentially for incremental reconstruction. The storage scheme allows us to optimize the mesh directly in explicit 3D space to achieve higher surface quality. We iteratively deform the mesh towards the target and recover vertex colors by optimizing a shading model. Experimental results on several datasets show that the proposed approach reconstructs scenes quickly and accurately, with realistic colors.
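The hybrid storage idea can be pictured with a minimal sketch: an octree is subdivided down to a fixed depth, and each leaf owns a small voxel grid whose cells hold both an implicit value (a running-average signed distance) and an explicit list of triangle faces. All names here (`Voxel`, `Node`, `HybridOctree`, `integrate`, `leaf_res`) are hypothetical, and the flat per-leaf grid, the fixed depth, and the simple averaging rule are simplifying assumptions rather than the authors' data structure.

```python
from dataclasses import dataclass, field

@dataclass
class Voxel:
    sdf: float = 1.0                           # implicit part: signed distance
    weight: float = 0.0                        # number of fused observations
    faces: list = field(default_factory=list)  # explicit part: triangle faces

@dataclass
class Node:
    origin: tuple                                 # minimum corner of the cube
    size: float                                   # edge length of the cube
    children: dict = field(default_factory=dict)  # octant bits -> Node
    voxels: dict = field(default_factory=dict)    # leaves only: (i, j, k) -> Voxel

class HybridOctree:
    """Sparse octree whose leaves hold a leaf_res^3 voxel grid (illustrative)."""

    def __init__(self, origin, size, max_depth, leaf_res):
        self.root = Node(origin, size)
        self.max_depth = max_depth
        self.leaf_res = leaf_res

    def _leaf(self, p):
        # Descend to the leaf containing p, allocating nodes lazily.
        # Assumption: p lies inside the root cube.
        node = self.root
        for _ in range(self.max_depth):
            half = node.size / 2
            bits = tuple(int(p[i] >= node.origin[i] + half) for i in range(3))
            if bits not in node.children:
                child_origin = tuple(node.origin[i] + bits[i] * half for i in range(3))
                node.children[bits] = Node(child_origin, half)
            node = node.children[bits]
        return node

    def voxel_at(self, p):
        # Locate (or create) the voxel inside the leaf that contains p.
        leaf = self._leaf(p)
        step = leaf.size / self.leaf_res
        idx = tuple(
            min(int((p[i] - leaf.origin[i]) / step), self.leaf_res - 1)
            for i in range(3)
        )
        return leaf.voxels.setdefault(idx, Voxel())

    def integrate(self, p, sdf):
        # Fuse one signed-distance observation with a running average.
        v = self.voxel_at(p)
        v.sdf = (v.sdf * v.weight + sdf) / (v.weight + 1)
        v.weight += 1
        return v
```

Because only the nodes and voxels touched by observations are allocated, memory grows with the observed surface rather than with the scene volume. Keeping `faces` next to the implicit value is what would let a partial mesh be extracted and refined per leaf as new frames arrive.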
