InFusionSurf: Refining Neural RGB-D Surface Reconstruction Using Per-Frame Intrinsic Refinement and TSDF Fusion Prior Learning
Seunghwan Lee, Gwanmo Park, Hyewon Son, Jiwon Ryu, Han Joo Chae
TL;DR
InFusionSurf addresses two weaknesses of NeRF-based RGB-D surface reconstruction, slow convergence and depth blur, by introducing per-frame intrinsic refinement (PFIR) and a TSDF Fusion prior learning phase. The method uses a dense feature grid with a three-phase optimization that first learns a geometric prior from TSDF Fusion, then progressively refines details with frame-specific ray corrections and higher-resolution grids. Quantitative and qualitative results show better geometry and faster convergence than GO-Surf and Neural RGB-D, and ablations confirm the contribution of both PFIR and the TSDF prior. The approach enables efficient, high-fidelity 3D reconstruction from RGB-D video, which matters in real-world settings where depth motion blur and training time are limiting factors.
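To make the idea of a frame-specific ray correction concrete, the following is a minimal sketch and not the authors' implementation. It assumes a pinhole camera and a hypothetical per-frame offset `delta` on the focal lengths and principal point; the function name `pixel_rays` and the parameterization of the correction are illustrative assumptions, since the text above does not specify how PFIR parameterizes its refinement.

```python
import numpy as np

def pixel_rays(height, width, fx, fy, cx, cy, delta=(0.0, 0.0, 0.0, 0.0)):
    """Unit ray directions in camera space for a pinhole camera.

    `delta` is a hypothetical per-frame correction (dfx, dfy, dcx, dcy)
    that an optimizer could learn separately for each frame. Zeros
    reproduce the unrefined, shared intrinsics.
    """
    dfx, dfy, dcx, dcy = delta
    # Pixel-centre coordinates: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    # Back-project through the (corrected) intrinsics onto the z = 1 plane.
    x = (u - (cx + dcx)) / (fx + dfx)
    y = (v - (cy + dcy)) / (fy + dfy)
    dirs = np.stack([x, y, np.ones_like(x)], axis=-1)
    # Normalize so every ray direction has unit length.
    return dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)

# Shared intrinsics versus the same frame with a small assumed correction.
base = pixel_rays(4, 6, fx=500.0, fy=500.0, cx=2.5, cy=1.5)
refined = pixel_rays(4, 6, 500.0, 500.0, 2.5, 1.5, delta=(2.0, 2.0, 0.3, -0.2))
```

In a learned setting, `delta` would be one small trainable vector per frame, optimized jointly with the scene representation so that each frame's rays can shift slightly to compensate for frame-specific depth errors.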
Abstract
We introduce InFusionSurf, an enhancement to neural radiance field (NeRF) frameworks for 3D surface reconstruction from RGB-D video frames. Building on previous methods that use feature encoding to speed up optimization, we further improve reconstruction quality by refining depth information, with minimal impact on optimization time. InFusionSurf addresses the blur that camera motion introduces in each depth frame through a per-frame intrinsic refinement scheme. It also uses truncated signed distance field (TSDF) Fusion, a classical real-time 3D surface reconstruction method, to pretrain the feature grid, which improves both reconstruction detail and training speed. Comparative quantitative and qualitative analyses show that InFusionSurf reconstructs scenes with high accuracy while maintaining optimization efficiency. An ablation study further validates the effectiveness of the intrinsic refinement and the TSDF Fusion-based pretraining.
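For readers unfamiliar with TSDF Fusion, the sketch below shows the classical per-voxel update: each frame's truncated signed distance is merged into the volume as a weighted running average. It illustrates the classical method only, not the pretraining procedure of InFusionSurf. The function name `fuse_tsdf`, the use of NaN to mark unobserved voxels, and the constant per-observation weight are assumptions made for illustration.

```python
import numpy as np

def fuse_tsdf(tsdf, weight, sdf_obs, trunc, obs_weight=1.0):
    """Fuse one frame's signed-distance observations into a TSDF volume.

    tsdf, weight : running TSDF values and accumulated weights per voxel
    sdf_obs      : this frame's signed distance per voxel (NaN = unobserved)
    trunc        : truncation distance, in the same units as sdf_obs
    """
    sdf = np.nan_to_num(sdf_obs)
    # Skip unobserved voxels and voxels far behind the observed surface.
    observed = ~np.isnan(sdf_obs) & (sdf >= -trunc)
    # Truncate and normalize the signed distance to [-1, 1].
    d = np.clip(sdf / trunc, -1.0, 1.0)
    new_weight = weight + obs_weight * observed
    # Weighted running average, applied only where the voxel was observed.
    fused = np.where(
        observed,
        (tsdf * weight + d * obs_weight) / np.maximum(new_weight, 1e-12),
        tsdf,
    )
    return fused, new_weight

# Three voxels, two frames, truncation distance 0.1.
tsdf, weight = np.zeros(3), np.zeros(3)
tsdf, weight = fuse_tsdf(tsdf, weight, np.array([0.02, np.nan, -0.5]), trunc=0.1)
tsdf, weight = fuse_tsdf(tsdf, weight, np.array([0.04, 0.01, np.nan]), trunc=0.1)
```

Because each update touches every voxel only once per frame and needs no gradient steps, the fused volume is cheap to compute, which is what makes it attractive as a geometric prior before neural optimization begins.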
