ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing

Jun-Kun Chen, Yu-Xiong Wang

TL;DR

This paper proposes ProEdit, a simple yet effective framework for high-quality 3D scene editing guided by diffusion distillation. ProEdit decomposes the overall editing task into several subtasks and executes them progressively on the scene, which controls the size of the diffusion model's feasible output space (FOS) and reduces multi-view inconsistency.

Abstract

This paper proposes ProEdit - a simple yet effective framework for high-quality 3D scene editing guided by diffusion distillation in a novel progressive manner. Inspired by the crucial observation that multi-view inconsistency in scene editing is rooted in the diffusion model's large feasible output space (FOS), our framework controls the size of the FOS and reduces inconsistency by decomposing the overall editing task into several subtasks, which are then executed progressively on the scene. Within this framework, we design a difficulty-aware subtask decomposition scheduler and an adaptive 3D Gaussian splatting (3DGS) training strategy, ensuring high quality and efficiency in performing each subtask. Extensive evaluation shows that ProEdit achieves state-of-the-art results in various scenes and challenging editing tasks, all through a simple framework without any expensive or sophisticated add-ons like distillation losses, components, or training procedures. Notably, ProEdit also provides a new way to control, preview, and select the "aggressivity" of the editing operation during the editing process.
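
The progressive idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how subtasks are parameterized, how difficulty is measured, or how 3DGS is trained. The sketch assumes that a subtask is identified by an interpolation ratio in [0, 1] between "no edit" and "full edit", that difficulty is some user-supplied estimate between two ratios, and that a step that is too hard is split by bisection. The names `interpolate`, `schedule_ratios`, `progressive_edit`, and `apply_subtask` are all hypothetical.

```python
from typing import Any, Callable, List, Sequence, Tuple


def interpolate(source: Sequence[float], target: Sequence[float], ratio: float) -> List[float]:
    """Linear blend of two condition vectors (assumed subtask formulation).

    ratio=0 returns the source condition, ratio=1 returns the full-edit target.
    """
    return [(1.0 - ratio) * s + ratio * t for s, t in zip(source, target)]


def schedule_ratios(
    difficulty: Callable[[float, float], float],
    max_difficulty: float,
    min_step: float = 1e-3,
) -> List[float]:
    """Split [0, 1] into steps whose estimated difficulty stays within a budget.

    Hypothetical scheduler: any step judged too hard is bisected, so hard
    regions of the edit receive more (smaller) subtasks than easy ones.
    """
    ratios = [0.0, 1.0]
    i = 0
    while i < len(ratios) - 1:
        lo, hi = ratios[i], ratios[i + 1]
        if difficulty(lo, hi) > max_difficulty and (hi - lo) > min_step:
            ratios.insert(i + 1, (lo + hi) / 2.0)  # split and re-check the left half
        else:
            i += 1  # this step is easy enough; move on
    return ratios


def progressive_edit(
    scene: Any,
    ratios: Sequence[float],
    apply_subtask: Callable[[Any, float], Any],
) -> Tuple[Any, List[Tuple[float, Any]]]:
    """Run the subtasks in order, starting each one from the previous result.

    `apply_subtask` stands in for one round of diffusion-guided editing plus
    scene refitting. Intermediate results are kept so that a less aggressive
    edit can be previewed or selected.
    """
    previews = []
    for ratio in ratios[1:]:  # ratios[0] is the unedited scene
        scene = apply_subtask(scene, ratio)
        previews.append((ratio, scene))
    return scene, previews


if __name__ == "__main__":
    # Toy difficulty: proportional to the size of the step.
    steps = schedule_ratios(lambda lo, hi: hi - lo, max_difficulty=0.3)
    print(steps)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Under these assumptions, each entry of `previews` corresponds to one intermediate level of editing, which mirrors the abstract's point that progression lets a user preview and choose how aggressive the final edit should be.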

Paper Structure

This paper contains 17 sections, 3 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: By decomposing a difficult task into easy subtasks and then progressively performing them (upper part), our ProEdit achieves high-quality 3D editing results with bright colors and detailed textures along with introducing new controllability of the editing aggressivity (lower part). More results are provided on https://immortalco.github.io/ProEdit/.
  • Figure 2: Our ProEdit framework features three major designs: an interpolation-based subtask formulation (Sec. \ref{sec:method:subtask-def}), a difficulty-aware subtask scheduler for subtask decomposition (Sec. \ref{sec:method:subtask-sch}), and an adaptive 3DGS tailored for progressive scene editing through a dual-GPU pipeline (Sec. \ref{sec:method:subtask-3dgs}). For an editing task, we first decompose it into interpolation-based subtasks to schedule the editing process with the subtask scheduler, and then progressively perform the subtasks with adaptive 3DGS.
  • Figure 3: In the comparative experiments on the Fangzhou and Face scenes, our ProEdit achieves high-quality editing, with strong instruction fidelity, clear textures, and precise shapes across both levels of aggressivity controlled by subtask scheduling. The "medium aggressivity" editing results are obtained from an intermediate subtask. The editing results of the baselines are sourced from visualizations in their respective papers.
  • Figure 4: In the comparative experiments on the ScanNet++ scenes, our simple ProEdit also achieves high-quality editing that is comparable to, and in some cases even outperforms, the sophisticated baseline ConsistDreamer. All visualizations are sourced from ConsistDreamer's paper.
  • Figure 5: In the comparative experiments across various outdoor scenes, our ProEdit not only achieves high-quality editing that surpasses the baselines, but also enables aggressivity controls for a range of scenes and tasks.
  • ...and 4 more figures