RobustNeRF: Ignoring Distractors with Robust Losses

Sara Sabour; Suhani Vora; Daniel Duckworth; Ivan Krasin; David J. Fleet; Andrea Tagliasacchi

TL;DR

RobustNeRF tackles the core NeRF vulnerability to non-persistent distractors by treating distractors as outliers in the optimization objective. It adopts a trimmed least-squares framework within an iteratively reweighted scheme, enhanced with spatially coherent outlier masking that evolves during training, enabling the model to ignore transient content while learning static scene structure. The approach is simple to integrate into existing NeRF pipelines, requires minimal hyperparameter tuning, and yields strong quantitative gains over mip-NeRF 360 and competitive results against D$^2$NeRF on real and synthetic distractor-rich datasets. While it introduces some statistical inefficiency on clean data and longer training times, RobustNeRF demonstrates robust reconstruction in cluttered environments and lays groundwork for further improvements such as learned weighting or applying the loss to other NeRF variants.
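A minimal sketch of the trimmed weighting idea described above, not the authors' implementation. It assumes a median threshold for trimming and a 3x3 box filter for spatial coherence; both are illustrative choices, as are the function names `trimmed_weights`, `box_smooth`, and `trimmed_loss`. It also ignores patch handling entirely. In an iteratively reweighted scheme, the weights would be recomputed from the current residuals at each training step and then held fixed while the weighted least-squares loss is minimized.

```python
from statistics import median

def trimmed_weights(residuals):
    """Binary weights: 1.0 where the residual is at or below the median (assumed threshold)."""
    threshold = median(r for row in residuals for r in row)
    return [[1.0 if r <= threshold else 0.0 for r in row] for row in residuals]

def box_smooth(weights, keep=0.5):
    """Average each weight over its 3x3 neighborhood (clipped at borders), then re-binarize."""
    h, w = len(weights), len(weights[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            neigh = [weights[a][b]
                     for a in range(max(0, i - 1), min(h, i + 2))
                     for b in range(max(0, j - 1), min(w, j + 2))]
            row.append(1.0 if sum(neigh) / len(neigh) >= keep else 0.0)
        out.append(row)
    return out

def trimmed_loss(residuals, weights):
    """Weighted least squares: only pixels with weight 1.0 contribute."""
    return sum(w * r * r
               for r_row, w_row in zip(residuals, weights)
               for r, w in zip(r_row, w_row))

# Toy 8x8 residual image: low background error, one isolated high-residual pixel
# (fine detail), and a 3x3 block of moderate residuals (a distractor-like region).
residuals = [[0.01] * 8 for _ in range(8)]
residuals[1][1] = 0.9
for i in range(4, 7):
    for j in range(4, 7):
        residuals[i][j] = 0.5

raw = trimmed_weights(residuals)   # per-pixel trimming only
smooth = box_smooth(raw)           # spatially coherent version
loss = trimmed_loss(residuals, smooth)
```

In this toy case the isolated pixel is trimmed by the per-pixel threshold but restored after smoothing, while the interior of the block stays masked, which mirrors the intent of favoring spatially coherent outlier regions over isolated large residuals.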

Abstract

Neural radiance fields (NeRF) excel at synthesizing new views given multi-view, calibrated images of a static scene. When scenes include distractors, which are not persistent during image capture (moving objects, lighting variations, shadows), artifacts appear as view-dependent effects or 'floaters'. To cope with distractors, we advocate a form of robust estimation for NeRF training, modeling distractors in training data as outliers of an optimization problem. Our method successfully removes outliers from a scene and improves upon our baselines, on synthetic and real-world scenes. Our technique is simple to incorporate in modern NeRF frameworks, with few hyper-parameters. It does not assume a priori knowledge of the types of distractors, and is instead focused on the optimization problem rather than pre-processing or modeling transient objects. More results on our page https://robustnerf.github.io.

Paper Structure (40 sections, 9 equations, 24 figures)

This paper contains 40 sections, 9 equations, 24 figures.

Introduction
Related Work
Neural Radiance Fields
Recent progress on NeRF models
Modeling non-static scenes
Method
Sensitivity to outliers
Robustness to outliers
Robustness via semantic segmentation
Robust estimators
Robustness via Trimmed Least Squares
Iteratively Reweighted Least Squares
Trimmed Robust Kernels
Experiments
Baselines
...and 25 more sections

Figures (24)

Figure 1: NeRF assumes photometric consistency in the observed images of a scene. Violations of this assumption, as with the images in the top row, yield reconstructed scenes with inconsistent content in the form of "floaters" (highlighted with ellipses). We introduce a simple technique that produces clean reconstructions by automatically ignoring distractors without explicit supervision.
Figure 2: Ambiguity -- A simple 2D scene where a static object (blue) is captured by three cameras. During the first and third captures the scene is not photo-consistent, as a distractor was within the field of view. Portions of the scene that are not photo-consistent can end up being encoded as view-dependent effects -- even when we assume ground truth geometry.
Figure 3: Histograms -- Robust estimators perform well when the distribution of residuals agrees with the one implied by the estimator (e.g., Gaussian for L2, Laplacian for L1). Here we visualize the ground-truth distribution of residuals (bottom-left), which is hardly a good match with any simple parametric distribution.
Figure 4: Kernels -- (top-left) Family of robust kernels [robustloss], including L2 ($\alpha{=}{2}$), Charbonnier ($\alpha{=}{1}$) and Geman-McClure ($\alpha{=}{-2}$). (top-right) Mid-training, residual magnitudes are similar for distractors and fine-grained details, and pixels with large residuals are learned more slowly, as the gradient of re-descending kernels flattens out. (bottom-right) A Geman-McClure kernel that down-weights large residuals too aggressively removes both outliers and high-frequency detail. (bottom-left) A less aggressive Geman-McClure kernel does not effectively remove outliers.
Figure 5: Algorithm -- We visualize our weight function computed from residuals on two examples: (top) the residuals of a (mid-training) NeRF rendered from a training viewpoint, (bottom) a toy residual image containing residuals of small spatial extent (dot, line) and residuals of large spatial extent (squares). Notice that residuals with large magnitude but small spatial extent (texture of the box, dot, line) are included in the optimization, while weaker residuals with larger spatial extent are excluded. Note that while we operate on patches, we visualize the weight function on the whole image to facilitate visualization.
...and 19 more figures
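
The kernel family named in the Figure 4 caption can be sketched as follows. This assumes the family is the general robust loss parameterized by a shape value $\alpha$ and a scale `c` (Barron's formulation), which is consistent with the caption's mapping of $\alpha{=}2$, $1$, and $-2$ to L2, Charbonnier, and Geman-McClure. The function name `robust_kernel` and the default scale are hypothetical choices made for illustration.

```python
import math

def robust_kernel(x, alpha, c=1.0):
    """General robust kernel rho(x; alpha, c), assuming Barron's parameterization.

    alpha = 2  -> L2 (quadratic)
    alpha = 1  -> Charbonnier (smooth L1)
    alpha = -2 -> Geman-McClure (bounded, re-descending)
    """
    z = (x / c) ** 2
    if alpha == 2:
        return 0.5 * z                    # limit of the general form as alpha -> 2
    if alpha == 0:
        return math.log(0.5 * z + 1.0)    # limit of the general form as alpha -> 0
    b = abs(alpha - 2.0)
    return (b / alpha) * ((z / b + 1.0) ** (alpha / 2.0) - 1.0)

# Compare how each kernel penalizes a small and a large residual.
for name, alpha in [("L2", 2), ("Charbonnier", 1), ("Geman-McClure", -2)]:
    small = robust_kernel(0.1, alpha)
    large = robust_kernel(10.0, alpha)
    print(f"{name:>14}: rho(0.1) = {small:.4f}, rho(10) = {large:.4f}")
```

With $\alpha{=}{-2}$ the penalty saturates for large residuals, so its gradient vanishes there. This is the re-descending behavior the caption refers to, and it is why the scale `c` decides whether fine detail is suppressed along with outliers.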
