Improving Geometric Consistency for 360-Degree Neural Radiance Fields in Indoor Scenarios

Iryna Repinetska, Anna Hilsmann, Peter Eisert

TL;DR

This work targets floaters and geometric inconsistencies in indoor 360-degree NeRF renderings by introducing dense depth priors for planar architectural surfaces (walls, floors, ceilings) and a boundary-focused depth loss (BoundL), complemented by patch-based depth regularization. The framework combines a fast, plane-based depth estimation pipeline with depth supervision through the boundary-enforcing loss, yielding more consistent geometry and fewer artifacts on synthetic 360-degree indoor scenes rendered with Instant-NGP. Quantitatively, BoundL with joint bilateral filtering achieves the highest PSNR and SSIM and the lowest LPIPS, and it converges faster than RGB-only training or standard (Mean Squared Error) depth supervision. The authors propose releasing the synthetic dataset to support future research and plan to extend the method to real-world data and to incorporate sparse depth information.

Abstract

Photo-realistic rendering and novel view synthesis play a crucial role in human-computer interaction tasks, from gaming to path planning. Neural Radiance Fields (NeRFs) model scenes as continuous volumetric functions and achieve remarkable rendering quality. However, NeRFs often struggle in large, low-textured areas, producing cloudy artifacts known as "floaters" that reduce scene realism, especially in indoor environments with featureless architectural surfaces such as walls, ceilings, and floors. To overcome this limitation, prior work has integrated geometric constraints into the NeRF pipeline, typically leveraging depth information derived from Structure from Motion or Multi-View Stereo. Yet conventional RGB-feature correspondence methods struggle to estimate depth accurately in textureless regions, leading to unreliable constraints. The problem is compounded in 360-degree "inside-out" views, where sparse visual overlap between adjacent images further hinders depth estimation. To address these issues, we propose an efficient and robust method for computing dense depth priors, specifically tailored to large, low-textured architectural surfaces in indoor environments. We introduce a novel depth loss function to enhance rendering quality in these challenging, low-feature regions, while complementary depth-patch regularization further refines depth consistency across other areas. Experiments with Instant-NGP on two synthetic 360-degree indoor scenes demonstrate improved visual fidelity with our method compared to standard photometric loss and Mean Squared Error depth supervision.

Paper Structure

This paper contains 11 sections, 11 equations, 20 figures, 1 table.

Figures (20)

  • Figure 1: Raw images captured with a pinhole camera model, showing unstitched frames prior to assembly into a 360-degree panorama. The living room is depicted in the first two rows and the bedroom in the last two. The first 15 images (from top left to bottom right) depict a 360-degree horizontal sweep, while the final 5 images capture the upper surroundings.
  • Figure 2: Illustration of a Gaussian distribution modeling the weight $w_i$ along a ray which hits the boundary surface (e.g., a wall) depicted by the red dotted line. The purple solid line indicates the ray with the green dots representing samples.
  • Figure 3: Renderings with Instant-NGP trained on our 360-degree indoor dataset using photometric loss show high visual fidelity on detail-rich areas.
  • Figure 4: Renderings produced by Instant-NGP trained on our 360-degree indoor dataset with photometric loss are displayed alongside their corresponding depth maps. Red bounding boxes highlight floaters in front of walls, ceilings, or floors, caused by incorrect depth estimations.
  • Figure 5: RGB
  • ...and 15 more figures
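
The idea in the Figure 2 caption, a Gaussian distribution modeling the weight $w_i$ of the samples along a ray that hits a boundary surface, can be illustrated with a small sketch. This is not the paper's BoundL implementation. The function name `gaussian_ray_weights`, the `sigma` parameter, and the normalization to a unit sum are assumptions made here for illustration only.

```python
import numpy as np

def gaussian_ray_weights(t_samples, boundary_depth, sigma=0.1):
    """Hypothetical target weights for samples along one ray.

    Assumption (not taken from the paper): the weights follow a Gaussian
    centered at the depth where the ray hits the boundary surface and are
    normalized to sum to 1.
    """
    t = np.asarray(t_samples, dtype=np.float64)
    w = np.exp(-0.5 * ((t - boundary_depth) / sigma) ** 2)
    # Guard against division by zero if every sample lies far from the boundary.
    return w / max(w.sum(), np.finfo(np.float64).tiny)

# Sample positions along the ray (the green dots in Figure 2).
t = np.linspace(0.0, 4.0, 64)
# Assumed depth of the wall along this ray (the red dotted line in Figure 2).
w = gaussian_ray_weights(t, boundary_depth=2.5, sigma=0.1)

# Depth implied by the weights: the weighted mean of the sample positions.
expected_depth = float((w * t).sum())
print(round(expected_depth, 3))
```

Under these assumptions the weight mass concentrates on the samples nearest the wall, so the weighted depth lands close to the boundary depth instead of in the free space in front of it, which is where floaters would otherwise appear.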