InFusionSurf: Refining Neural RGB-D Surface Reconstruction Using Per-Frame Intrinsic Refinement and TSDF Fusion Prior Learning

Seunghwan Lee, Gwanmo Park, Hyewon Son, Jiwon Ryu, Han Joo Chae

TL;DR

InFusionSurf tackles the slow convergence and depth-blur challenges of NeRF-based RGB-D surface reconstruction by introducing per-frame intrinsic refinement (PFIR) and a TSDF Fusion prior learning phase. The method employs a dense feature grid with a three-phase optimization that first learns a geometric prior from TSDF Fusion, then progressively refines details with frame-specific ray corrections and higher-resolution grids. Quantitative and qualitative results show superior geometry and faster convergence compared to GO-Surf and Neural RGB-D, with ablations confirming the benefits of PFIR and TSDF priors. This approach enables efficient, high-fidelity 3D reconstruction from RGB-D video, offering practical advantages for real-world applications where depth motion blur and training time are critical factors.

Abstract

We introduce InFusionSurf, an innovative enhancement for neural radiance field (NeRF) frameworks in 3D surface reconstruction using RGB-D video frames. Building upon previous methods that have employed feature encoding to improve optimization speed, we further improve the reconstruction quality with minimal impact on optimization time by refining depth information. InFusionSurf addresses camera motion-induced blur in each depth frame through a per-frame intrinsic refinement scheme. It incorporates truncated signed distance field (TSDF) Fusion, a classical real-time 3D surface reconstruction method, as a pretraining tool for the feature grid, enhancing reconstruction details and training speed. Comparative quantitative and qualitative analyses show that InFusionSurf reconstructs scenes with high accuracy while maintaining optimization efficiency. The effectiveness of our intrinsic refinement and TSDF Fusion-based pretraining is further validated through an ablation study.
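
For readers unfamiliar with TSDF Fusion, the following is a minimal sketch of the classical running weighted-average update, reduced to a single camera ray for brevity. It is not the authors' implementation and says nothing about how the paper feeds the fused volume into its feature grid. The function names (`tsdf_update`, `zero_crossing`), the truncation distance of 0.3, the unit per-observation weight, and the sample depths are all illustrative assumptions.

```python
def tsdf_update(tsdf, weight, voxel_depths, observed_depth, trunc):
    """Fuse one depth observation into a 1D TSDF sampled along a camera ray."""
    for i, z in enumerate(voxel_depths):
        sdf = observed_depth - z           # positive in front of the surface
        if sdf < -trunc:                   # far behind the surface: occluded, skip
            continue
        d = max(-1.0, min(1.0, sdf / trunc))  # truncate and normalize to [-1, 1]
        w_new = 1.0                        # assumed constant observation weight
        # Running weighted average of all observations seen so far.
        tsdf[i] = (weight[i] * tsdf[i] + w_new * d) / (weight[i] + w_new)
        weight[i] += w_new
    return tsdf, weight


def zero_crossing(tsdf, weight, voxel_depths):
    """Return the depth where the fused TSDF changes sign, or None."""
    for i in range(len(tsdf) - 1):
        observed = weight[i] > 0 and weight[i + 1] > 0
        if observed and tsdf[i] > 0 >= tsdf[i + 1]:
            # Linear interpolation between the two neighboring voxels.
            t = tsdf[i] / (tsdf[i] - tsdf[i + 1])
            return voxel_depths[i] + t * (voxel_depths[i + 1] - voxel_depths[i])
    return None


# Hypothetical example: 21 voxels spaced 0.1 apart, three noisy depth readings.
voxel_depths = [i * 0.1 for i in range(21)]
tsdf = [0.0] * len(voxel_depths)
weight = [0.0] * len(voxel_depths)
for depth in (0.98, 1.02, 1.00):
    tsdf_update(tsdf, weight, voxel_depths, depth, trunc=0.3)
surface = zero_crossing(tsdf, weight, voxel_depths)
print(surface)
```

Averaging over frames is what lets the fused field suppress per-frame depth noise, and voxels far behind the observed surface keep zero weight because no frame provides evidence about them.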

Paper Structure (24 sections, 14 equations, 6 figures, 2 tables)

This paper contains 24 sections, 14 equations, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: Our method proposes per-frame intrinsic refinement and classical TSDF Fusion prior learning schemes for high-quality 3D surface reconstruction with minimal impact on optimization time. We adopt the Neural RGB-D method, revised with a dense feature grid and shallow MLPs. Our per-frame intrinsic refinement scheme compensates for the frame-specific distortion effects caused by camera motion. The first phase of training learns a geometric prior using the TSDF Fusion algorithm, and the later phases adopt a progressive learning technique.
  • Figure 2: Samples from the ScanNet V2 dataset [scannet] demonstrate the negative impact of motion blur. The RGB frames (a, b) are blurry and distorted. Unlike the color frames, the depth frames (c) contain extended object boundaries rather than averaged blur [depth_map_blur].
  • Figure 3: Ablation study. (a) Ours without per-frame intrinsic refinement (PFIR). (b) Ours without TSDF Fusion prior learning in the first phase of training (TSDF). (c) Ours with both methods applied. Timestamps below the subfigures represent the TSDF Fusion prior learning time (if applicable) and the total training time.
  • Figure 4: We compare our method with GO-Surf [GOSurf] and Neural RGB-D [NeuralRGBD] at different points in time. The comparison was conducted using scenes 2, 5, 12, and 50 from ScanNet V2 [scannet]. When trained for a shorter amount of time, InFusionSurf-20K (b) recovers high-frequency details overlooked by GO-Surf (a) and generates far fewer erroneous surfaces. Given a longer training time, InFusionSurf-75K (c) achieves greater quality while recovering a number of geometries missing from Neural RGB-D (d).
  • Figure 5: Visualization of our per-frame intrinsic refinement module. (a), (b) Color frames with camera motion blur. (c) Superimposed depth frames show incorrectly extended object boundaries. (d) Our per-frame intrinsic refinement module aligns the depth frames with the actual boundaries.
  • ...and 1 more figure