Neural Radiance Fields with Torch Units
Bingnan Ni, Huanyu Wang, Dongfeng Bai, Minghe Weng, Dexin Qi, Weichao Qiu, Bingbing Liu
TL;DR
This work tackles NeRF-based reconstruction in complex, large-scale scenes, where conventional per-pixel ray querying captures little context and struggles with backgrounds that vary strongly across views. It introduces Torch-NeRF, which renders a patch of pixels per camera ray and applies distance-aware convolutions along each ray to model interactions among sample points, paired with a coarse/fine training scheme that improves efficiency. The approach improves on baselines on KITTI-360 and LLFF in PSNR, SSIM, and LPIPS without requiring semantic priors, and it preserves structure better and produces less noise in challenging outdoor environments. Overall, Torch-NeRF advances scalable neural radiance field reconstruction by expanding the ray perception field and enabling contextualized, patch-based rendering.
Abstract
Neural Radiance Fields (NeRF) have given rise to learning-based 3D reconstruction methods that are widely used in industrial applications. Although prevalent methods achieve considerable improvements in small-scale scenes, reconstruction in complex, large-scale scenes remains challenging. First, the background in complex scenes varies greatly across views. Second, the current inference pattern, i.e., each pixel relying on a single camera ray, fails to capture contextual information. To solve these problems, we propose to enlarge the ray perception field and to build interactions among sample points. In this paper, we design a novel inference pattern that lets a single camera ray carry more contextual information and models the relationship among the sample points on each camera ray. To hold contextual information, a camera ray in our proposed method renders a patch of pixels simultaneously. Moreover, we replace the MLP in neural radiance field models with distance-aware convolutions to enhance feature propagation among sample points on the same camera ray. In summary, like the beam of a torchlight, a ray in our proposed method renders a patch of the image rather than a single pixel. We therefore call the proposed method Torch-NeRF. Extensive experiments on KITTI-360 and LLFF show that Torch-NeRF exhibits excellent performance.
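As a rough illustration of the two ideas above, the following Python sketch shows (a) a 1D convolution along a ray whose taps are attenuated by the distance between sample points, and (b) standard NeRF alpha compositing applied to a patch of values per sample instead of a single color. The abstract does not specify the form of the distance-aware convolution, so the exponential falloff, the length scale `tau`, the scalar `kernel`, the normalization, and both function names are assumptions made for illustration only and are not the authors' implementation.

```python
import math


def distance_aware_conv(features, depths, kernel, tau=1.0):
    """1D convolution along a ray; each tap is down-weighted by sample distance.

    features: per-sample feature vectors (lists of floats)
    depths:   sample depths t_i along the ray
    kernel:   odd-length list of scalar tap weights (hypothetical, shared by channels)
    tau:      hypothetical length scale of the exp(-|t_i - t_j| / tau) falloff
    """
    n, half = len(features), len(kernel) // 2
    out = []
    for i in range(n):
        acc = [0.0] * len(features[i])
        norm = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if j < 0 or j >= n:
                continue  # skip taps that fall off either end of the ray
            # Assumed distance modulation: far-apart samples interact less.
            g = w * math.exp(-abs(depths[i] - depths[j]) / tau)
            norm += abs(g)
            for c, v in enumerate(features[j]):
                acc[c] += g * v
        # Normalize so the result does not depend on how many taps were valid.
        out.append([a / norm if norm > 0 else 0.0 for a in acc])
    return out


def composite_patch(sigmas, patches, depths):
    """Standard NeRF alpha compositing, applied to a flattened patch per sample.

    sigmas:  per-sample densities
    patches: per-sample flattened patches (e.g. P*P*3 values each)
    depths:  at least two sample depths; the last interval reuses the
             previous spacing (an assumption of this sketch)
    """
    n = len(sigmas)
    out = [0.0] * len(patches[0])
    transmittance = 1.0
    for i in range(n):
        delta = depths[i + 1] - depths[i] if i + 1 < n else depths[i] - depths[i - 1]
        alpha = 1.0 - math.exp(-sigmas[i] * delta)
        weight = transmittance * alpha
        for c, v in enumerate(patches[i]):
            out[c] += weight * v
        transmittance *= 1.0 - alpha
    return out
```

In this sketch, a ray would first pass its per-sample features through `distance_aware_conv` so that neighboring samples exchange information, and the per-sample patch predictions would then be blended by `composite_patch` into one rendered patch per ray. A real model would use learned, multi-channel convolution weights in place of the scalar `kernel`.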
