SealD-NeRF: Interactive Pixel-Level Editing for Dynamic Scenes by Neural Radiance Fields

Zhentao Huang, Yukun Shi, Neil Bruce, Minglun Gong

TL;DR

This work tackles pixel-level editing in dynamic Neural Radiance Fields by extending Seal-3D to a D-NeRF backbone. It uses teacher-student distillation: an edit defined on a single time frame is projected into the canonical space via a mapping $F_m$ and used to supervise a student model while the deformation network stays frozen, so the edit propagates consistently across all frames. The authors reimplement D-NeRF in Torch-NGP, introduce brush and sealing tools, and report both quantitative (PSNR) results and qualitative edits across multiple dynamic scenes. Limitations include possible propagation of teacher errors and difficulty with edits that require changing the deformation pattern. Suggested future work includes expanding the editing toolkit and exploring alternative scene representations such as skeleton-based or Gaussian-based models.

Abstract

The widespread adoption of implicit neural representations, especially Neural Radiance Fields (NeRF), highlights a growing need for editing capabilities in implicit 3D models, essential for tasks like scene post-processing and 3D content creation. Despite previous efforts in NeRF editing, challenges remain due to limitations in editing flexibility and quality. The key issue is developing a neural representation that supports local edits for real-time updates. Current NeRF editing methods, offering pixel-level adjustments or detailed geometry and color modifications, are mostly limited to static scenes. This paper introduces SealD-NeRF, an extension of Seal-3D for pixel-level editing in dynamic settings, specifically targeting the D-NeRF network. It allows for consistent edits across sequences by mapping editing actions to a specific timeframe, freezing the deformation network responsible for dynamic scene representation, and using a teacher-student approach to integrate changes.
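
The freeze-and-distill idea described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a 1D scene, replaces the canonical network with a small lookup table, and replaces the deformation network with a fixed shift. All names here (`deform`, `cell`, `query`, `distill_edit`, `GRID`, `SHIFT`) are hypothetical and chosen for illustration only. The point is that an edit supervised at a single time frame, with the deformation left untouched, shows up at other time frames through the shared canonical space.

```python
import random

GRID = 20      # number of cells in the toy 1D canonical "network"
SHIFT = 0.25   # frozen deformation parameter (never updated below)

def deform(x, t):
    """Frozen stand-in for the deformation network: observed x at time t -> canonical coordinate."""
    return (x - SHIFT * t) % 1.0

def cell(x_canonical):
    """Index of the canonical cell that stores the value for this coordinate."""
    return min(int(x_canonical * GRID), GRID - 1)

def query(canonical, x, t):
    """Evaluate the dynamic field: deform first, then look up the canonical value."""
    return canonical[cell(deform(x, t))]

def distill_edit(teacher, t_edit, region, new_value, steps=2000, lr=0.5, seed=0):
    """Train a student copy so that it matches the edited teacher at time t_edit."""
    rng = random.Random(seed)
    student = list(teacher)          # the student starts as a copy of the teacher
    lo, hi = region
    for _ in range(steps):
        x = rng.random()             # sample an observed point at the edited frame
        # Guidance: edited value inside the brushed region, teacher output elsewhere.
        target = new_value if lo <= x < hi else query(teacher, x, t_edit)
        i = cell(deform(x, t_edit))
        # Gradient step on 0.5 * (student[i] - target)**2; only canonical values change.
        student[i] -= lr * (student[i] - target)
    return student

# Toy scene: the canonical value simply increases with position.
teacher = [i / GRID for i in range(GRID)]
# Brush the observed interval [0.4, 0.6) at time t = 0 with the value 9.0.
student = distill_edit(teacher, t_edit=0.0, region=(0.4, 0.6), new_value=9.0)
```

In this sketch, `query(student, 0.75, 1.0)` returns a value close to the brushed one even though the edit was made at `t = 0`, because the point observed at 0.75 at time 1 deforms to the same canonical cell that was edited. Points outside the brushed region keep their teacher values, since their supervision target equals what the student already stores.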

Paper Structure (14 sections, 5 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: SealD-NeRF contains two main stages: editing guidance generation and time frame propagation. First, the mapping function $F_m$ is generated from user-interface input and maps the original source space $S$ to the target space $T$. The target space then supervises student model training through renderings from multiple views. During student training, the deformation network $\Psi_t$ is frozen to preserve the object's movement; only the canonical network $\Psi_x$ is optimized, which propagates the edit to all time frames.
  • Figure 2: The user interface of SealD-NeRF, based on Seal-3D [wang2023seal] and Torch-NGP [Torch-NGP]. The user can watch the training procedure directly and view any time frame through a control bar. Besides the control parameters of the toolkit, the interface also lets the user switch between the student and teacher models for reference.
  • Figure 3: Examples of the canonical spaces of the scenes "Lego", "Hook", and "Jumping Jacks". Top row: D-NeRF [pumarola2021d] implementation based on PyTorch. Bottom row: our implementation based on Torch-NGP [Torch-NGP].
  • Figure 4: Examples of the brush/sealing tool editing on the scenes: "Jumping Jacks", "Stand Up", and "Bouncing Balls".
  • Figure 5: Examples of the sealing tool editing on the scenes: "Hell Warrior" and "Mutant".
  • ...and 1 more figures