
RehearsalNeRF: Decoupling Intrinsic Neural Fields of Dynamic Illuminations for Scene Editing

Changyeon Won, Hyunjun Jung, Jungu Cho, Seonmi Park, Chi-Hoon Lee, Hae-Gon Jeon

Abstract

Despite significant progress in neural radiance fields, handling dynamic illumination changes remains an open problem. Unlike the settings of related works that parameterize time-variant/invariant components of a scene, a subject's observed radiance is highly entangled with its own emitted radiance and the lighting colors in the spatio-temporal domain. In this paper, we present RehearsalNeRF, a new and effective method for learning disentangled neural fields under severe illumination changes. Our key idea is to leverage scenes captured under stable lighting, such as rehearsal stages, which can easily be recorded before the dynamic illumination occurs, to enforce geometric consistency between the different lighting conditions. In particular, RehearsalNeRF employs a learnable illumination vector that represents lighting colors along the temporal dimension and is used to disentangle projected light colors from scene radiance. Furthermore, RehearsalNeRF can also reconstruct the neural fields of dynamic objects by simply adopting off-the-shelf interactive masks. To decouple the dynamic objects, we propose a new regularization based on optical flow, which provides coarse supervision for the color disentanglement. We demonstrate the effectiveness of RehearsalNeRF through robust performance on novel view synthesis and scene editing under dynamic illumination conditions. Our source code and video datasets will be publicly available.
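As a reading aid, here is a minimal PyTorch sketch of the abstract's central device: a per-frame learnable lighting color that modulates an intrinsic (light-free) scene color, so that the two factors can be optimized and disentangled jointly. The multiplicative composition and all names (`IlluminationVector`, `num_frames`) are illustrative assumptions rather than the authors' released code.

```python
import torch
import torch.nn as nn

class IlluminationVector(nn.Module):
    """Per-frame learnable lighting color, optimized jointly with the fields."""
    def __init__(self, num_frames: int, channels: int = 3):
        super().__init__()
        # One lighting color per time-stamp; starts neutral (all ones).
        self.v = nn.Parameter(torch.ones(num_frames, channels))

    def forward(self, frame_idx: torch.Tensor) -> torch.Tensor:
        # Slice out the lighting color for each queried time-stamp.
        return self.v[frame_idx]

# Toy composition: the observed radiance is the intrinsic (light-free) color
# modulated by the current lighting color, so gradients from a photometric
# loss flow into both factors and the two can be pulled apart.
light = IlluminationVector(num_frames=120)
intrinsic = torch.rand(4096, 3, requires_grad=True)   # stand-in for a field output
frame_idx = torch.randint(0, 120, (4096,))
observed = intrinsic * light(frame_idx)               # (4096, 3)
observed.sum().backward()                             # both factors receive gradients
```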

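The abstract also proposes an optical-flow regularization that coarsely supervises the color disentanglement of dynamic objects. Since the exact formulation is not given on this page, the following is only one plausible instantiation: backward-warp the rendered dynamic color along a precomputed flow and penalize disagreement inside the off-the-shelf dynamic mask. The function name and all shape conventions are assumptions.

```python
import torch
import torch.nn.functional as F

def flow_regularization(rgb_t, rgb_t1, flow_t1_to_t, dyn_mask):
    """One plausible coarse flow supervision (not necessarily the paper's loss):
    backward-warp the color rendered at frame t to frame t+1 using the
    precomputed backward flow, then penalize the masked photometric error.
    Shapes: rgb_* (1, 3, H, W); flow_t1_to_t (1, 2, H, W) in pixels,
    channel 0 = x-displacement; dyn_mask (1, 1, H, W) in [0, 1].
    """
    _, _, h, w = rgb_t.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).float()          # (H, W, 2) pixel coords
    grid = grid + flow_t1_to_t[0].permute(1, 2, 0)        # displace by backward flow
    grid[..., 0] = 2.0 * grid[..., 0] / (w - 1) - 1.0     # normalize x to [-1, 1]
    grid[..., 1] = 2.0 * grid[..., 1] / (h - 1) - 1.0     # normalize y to [-1, 1]
    warped = F.grid_sample(rgb_t, grid.unsqueeze(0), align_corners=True)
    return ((warped - rgb_t1).abs() * dyn_mask).mean()
```

As a quick sanity check, with zero flow and a full mask this reduces to a plain masked L1 photometric loss between consecutive frames.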

Paper Structure

This paper contains 21 sections, 11 equations, 22 figures, and 8 tables.

Figures (22)

  • Figure 1: RehearsalNeRF jointly optimizes five neural fields for the dynamic lighting and the static/moving subjects in a single training step. Each field can not only be rendered independently but also be composited with the others to represent the whole scene. Compared to the baseline D$^2$NeRF (Wu et al., 2022), the details from RehearsalNeRF are distinguishable and the rendered colors are well decoupled from the scene lights. We provide a variety of video-editing applications, such as controlling the lighting while freezing motion and vice versa.
  • Figure 2: An overview of RehearsalNeRF in the training phase. RehearsalNeRF consists of five neural fields for rendering static objects, dynamic objects, and illumination. Each field takes the location ($\mathbf{x}$), viewing direction ($\mathbf{d}$), and time-stamp ($\tau$) of the same sample point as input, except that the static fields do not take the time-stamp. These inputs are used to infer the radiance. In particular, the illumination field predicts the probability $P_H$ of the illumination vector $v_h$ corresponding to $\tau$. To cleanly decouple the static/dynamic objects and the illumination components of the main-stage video, two additional regularizations, $L_{reh}$ and $L_{dyn}$, which use the rehearsal prior, are designed.
  • Figure 3: Visualization of how the illumination vector works (a hedged code sketch follows this figure list). (a) The learnable illumination vector is initialized by sampling the hue channel of the dynamic lighting from the difference between the rehearsal and the main-stage video. (b) The rendered hue channel is computed as a weighted sum of the probability $P_H$ and the slice of $v_h$ for time $\tau$; back-propagation through this sum optimizes the hue channel of the dynamic lighting effects.
  • Figure 4: Qualitative comparisons on our real-world dataset. The goal of this scene edit is to render the scene under red-colored illumination. Only our method produces high-quality edited scenes, thanks to its successful dynamic illumination decomposition.
  • Figure 5: Qualitative comparison of novel view synthesis quality for dynamic objects on our real-world dataset. The second row presents the error map between the rendered image and the ground truth. Our model consistently achieves superior quality in dynamic regions.
  • ...and 17 more figures
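The captions of Figures 2 and 3 describe the core mechanism: an illumination field predicts a probability $P_H$ over the entries of a learnable illumination vector $v_h$, and the rendered hue is the weighted sum of $P_H$ and the slice of $v_h$ for time-stamp $\tau$. Below is a minimal PyTorch sketch of that step; the bin count, the softmax head, the tiny MLP, and all names (`IlluminationField`, `num_bins`) are illustrative assumptions, since the paper's code is not yet released.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IlluminationField(nn.Module):
    """Illustrative stand-in for the illumination field of Figure 2."""
    def __init__(self, num_frames: int, num_bins: int = 16, hidden: int = 64):
        super().__init__()
        # v_h: learnable illumination vector, one row of hue entries per frame.
        self.v_h = nn.Parameter(torch.rand(num_frames, num_bins))
        # Small MLP mapping (x, d, tau) to logits over the entries of v_h.
        self.mlp = nn.Sequential(
            nn.Linear(3 + 3 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, num_bins),
        )

    def forward(self, x, d, tau, frame_idx):
        # P_H: probability of each illumination-vector entry for this sample.
        p_h = F.softmax(self.mlp(torch.cat([x, d, tau], dim=-1)), dim=-1)
        # Rendered hue = weighted sum of P_H and the slice of v_h for time tau.
        return (p_h * self.v_h[frame_idx]).sum(dim=-1, keepdim=True)

field = IlluminationField(num_frames=120)
x, d = torch.rand(4096, 3), torch.rand(4096, 3)   # sample points and view directions
tau = torch.rand(4096, 1)                         # normalized time-stamps
idx = torch.randint(0, 120, (4096,))              # frame index of each sample
hue = field(x, d, tau, idx)                       # (4096, 1), differentiable in v_h
```

Because the rendered hue is linear in $v_h$, gradients from a photometric loss on the hue channel flow directly into the illumination vector, matching the back-propagation path sketched in Figure 3(b).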