INPC: Implicit Neural Point Clouds for Radiance Field Rendering
Florian Hahlbohm, Linus Franke, Moritz Kappel, Susana Castillo, Martin Eisemann, Marc Stamminger, Marcus Magnor
TL;DR
This work introduces Implicit Neural Point Clouds (INPC), a hybrid representation that encodes geometry in an octree-based point probability field and appearance in a multi-resolution hash grid. Explicit point clouds can be extracted from this representation and rendered with fast rasterization, including for unbounded scenes. By combining view-specific and view-independent sampling, differentiable bilinear splatting, and an end-to-end optimization pipeline with robust and perceptual losses, INPC achieves state-of-the-art perceptual image quality on challenging benchmarks while rendering at interactive speeds on consumer hardware. The approach avoids heavy ray-marching and does not depend on priors such as structure-from-motion point clouds, yet it preserves fine geometric detail, and trained models can be converted into dense explicit point clouds to further boost performance. Extensive experiments, including ablations and a perceptual study against Zip-NeRF, demonstrate improved visual fidelity and robust optimization, supporting the practicality of a scalable, hybrid radiance-field representation.
Abstract
We introduce a new approach for reconstruction and novel view synthesis of unbounded real-world scenes. In contrast to previous methods that use volumetric fields, grid-based models, or discrete point cloud proxies, we propose a hybrid scene representation that implicitly encodes the geometry in a continuous octree-based probability field and the view-dependent appearance in a multi-resolution hash grid. This allows for extraction of arbitrary explicit point clouds, which can be rendered using rasterization. In doing so, we combine the benefits of both worlds and retain favorable behavior during optimization: Our novel implicit point cloud representation and differentiable bilinear rasterizer enable fast rendering while preserving the fine geometric detail captured by volumetric neural fields. Furthermore, this representation does not depend on priors like structure-from-motion point clouds. Our method achieves state-of-the-art image quality on common benchmarks. We also achieve fast inference at interactive frame rates and can convert our trained model into a large, explicit point cloud to further enhance performance.
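To make the idea of bilinear splatting more concrete, the following is a minimal NumPy sketch of how a point at a continuous image position can distribute its contribution over the four nearest pixels. It is an illustration only and not the authors' rasterizer: the function name `bilinear_splat`, the convention that integer coordinates are pixel centers, and the plain additive accumulation are all assumptions made here, and the sketch ignores depth ordering, opacity, and the appearance model entirely.

```python
import numpy as np


def bilinear_splat(points_xy, values, height, width):
    """Accumulate per-point values into an image using bilinear weights.

    Hypothetical helper for illustration; not taken from the INPC code.
    points_xy: (N, 2) array of continuous pixel coordinates (x, y).
    values:    (N,) array of per-point scalar values.
    Returns (image, weight_sum), both of shape (height, width).
    """
    image = np.zeros((height, width), dtype=np.float64)
    weight_sum = np.zeros((height, width), dtype=np.float64)

    x, y = points_xy[:, 0], points_xy[:, 1]
    x0 = np.floor(x).astype(np.int64)
    y0 = np.floor(y).astype(np.int64)
    fx, fy = x - x0, y - y0  # fractional offsets in [0, 1)

    # Each point touches the 2x2 block of pixels around it.
    for dx, dy in ((0, 0), (1, 0), (0, 1), (1, 1)):
        px, py = x0 + dx, y0 + dy
        wx = fx if dx == 1 else 1.0 - fx
        wy = fy if dy == 1 else 1.0 - fy
        w = wx * wy
        inside = (px >= 0) & (px < width) & (py >= 0) & (py < height)
        # np.add.at performs unbuffered addition, so several points that
        # land on the same pixel are all accumulated.
        np.add.at(image, (py[inside], px[inside]), w[inside] * values[inside])
        np.add.at(weight_sum, (py[inside], px[inside]), w[inside])
    return image, weight_sum


# A single point at (x=1.25, y=2.5) spreads over pixels (1..2, 2..3).
points = np.array([[1.25, 2.5]])
image, weights = bilinear_splat(points, np.array([1.0]), height=4, width=4)
```

The four weights of a point that lies fully inside the image sum to one, and they vary piecewise-linearly with the point position, which is the property that makes this style of splatting suitable for gradient-based optimization.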
