ARAP-GS: Drag-driven As-Rigid-As-Possible 3D Gaussian Splatting Editing with Diffusion Prior

Xiao Han, Runze Tian, Yifei Tong, Fenggen Yu, Dingyao Liu, Yan Zhang

TL;DR

ARAP-GS tackles drag-driven editing of 3D Gaussian Splatting by applying As-Rigid-As-Possible deformation directly to a representative subset of Gaussians and interpolating to the rest, followed by diffusion-prior–based refinement to preserve appearance across views. The method merges a geometry-preserving ARAP deformation stage with a diffusion-prior image enhancement stage that uses Iterative Dataset Update and mask-guided fine-tuning to reduce view-inconsistency artifacts. Experiments demonstrate superior qualitative and quantitative performance over baselines, achieving high visual quality, strong multi-view coherence, and efficiency on a single RTX 3090 (roughly 10–20 minutes per scene). Overall, ARAP-GS provides a practical, accurate, and efficient framework for intuitive 3DGS editing with robust cross-view consistency and improved rendering fidelity.

Abstract

Drag-driven editing has become popular among designers for its ability to modify complex geometric structures through simple and intuitive manipulation, allowing users to adjust and reshape content with minimal technical skill. This drag operation has been incorporated into numerous methods to facilitate the editing of 2D images and 3D meshes in design. However, few studies have explored drag-driven editing for the widely-used 3D Gaussian Splatting (3DGS) representation, as deforming 3DGS while preserving shape coherence and visual continuity remains challenging. In this paper, we introduce ARAP-GS, a drag-driven 3DGS editing framework based on As-Rigid-As-Possible (ARAP) deformation. Unlike previous 3DGS editing methods, we are the first to apply ARAP deformation directly to 3D Gaussians, enabling flexible, drag-driven geometric transformations. To preserve scene appearance after deformation, we incorporate an advanced diffusion prior for image super-resolution within our iterative optimization process. This approach enhances visual quality while maintaining multi-view consistency in the edited results. Experiments show that ARAP-GS outperforms current methods across diverse 3D scenes, demonstrating its effectiveness and superiority for drag-driven 3DGS editing. Additionally, our method is highly efficient, requiring only 10 to 20 minutes to edit a scene on a single RTX 3090 GPU.

Paper Structure

This paper contains 17 sections, 7 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: Results of ARAP-GS. Given a set of handle points and their deformations, ARAP-GS can efficiently achieve drag-driven 3DGS editing. Our method deforms the geometry of the 3DGS scene through rotation (above) or stretching (below) while preserving the original appearance and multi-view consistency. The first column illustrates the dragging operation, with the red points indicating the handle points and the arrows indicating the dragging directions. "Original" and "Edited" denote the rendering results before and after editing.
  • Figure 2: Method overview. Our method is implemented in two stages. In the first stage, for geometric deformations during editing, we leverage the explicit representation of 3D Gaussians, establishing a representative subset $Q$. Then we apply the traditional ARAP deformation directly to 3D Gaussians in $Q$, and obtain the rotation matrix $R$ and new position $p'$ for each 3D Gaussian. The rotation matrices and new positions of the remaining 3D Gaussians are then inferred by interpolating those of the nearest neighbors in $Q$ (Sec. 3.2). The deformation changes the position and covariance of the 3D Gaussians. To further update properties such as color and scale values of 3D Gaussians, we fine-tune the deformed 3D Gaussians in the second stage. Specifically, we utilize a 2D diffusion prior to remove artifacts on the rendered images and iteratively optimize the 3D Gaussians based on the enhanced images (Sec. 3.3).
  • Figure 3: Visual results of our method. The first two rows show the results of stretching, and the last two rows show the results of rotation. The first column shows the dragging operation, with the red points indicating the handle points and the arrows indicating the dragging directions. "Original" and "Edited" denote the rendering results before and after editing.
  • Figure 4: Qualitative comparison results. We compare our method with state-of-the-art methods including I-N2N [haque2023instruct], GaussianEditor [chen2024gaussianeditor], DragDiffusion [shi2024dragdiffusion] and SDEDrag [nie2023blessing]. Compared to these methods, our method achieves more accurate geometric deformation and better multi-view consistency, delivering higher visual quality results.
  • Figure 5: Effectiveness of ARAP Deformation. We select two views to compare the results with and without ARAP Deformation. We magnify the deformed region to highlight our method's ability to preserve the geometric structure of the scene after deformation.
  • ...and 2 more figures
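
The Figure 2 caption describes solving ARAP on a subset $Q$ and then inferring rotations and positions for the remaining Gaussians from their nearest neighbors in $Q$. The following is a minimal NumPy sketch of that propagation step, not the authors' implementation. The interpolation details are assumptions: inverse-distance weights over `k` nearest neighbors, a per-neighbor rigid prediction for positions, and a weighted blend of rotation matrices projected back to a rotation via SVD. The function names `project_to_rotation` and `propagate_deformation` are hypothetical. The covariance update $\Sigma' = R \Sigma R^\top$ is the standard way a rotation acts on a covariance matrix.

```python
import numpy as np

def project_to_rotation(M):
    """Return the rotation matrix closest (in Frobenius norm) to M."""
    U, _, Vt = np.linalg.svd(M)
    # Flip the last axis if needed so that det(R) = +1 (no reflection).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt

def propagate_deformation(pos, cov, q_idx, q_new_pos, q_rot, k=3, eps=1e-8):
    """Spread a deformation solved on subset Q to every Gaussian.

    pos: (N, 3) centers, cov: (N, 3, 3) covariances,
    q_idx: (M,) indices of Q, q_new_pos: (M, 3), q_rot: (M, 3, 3).
    The weighting scheme is an assumption, not taken from the paper.
    """
    q_pos = pos[q_idx]
    new_pos = np.empty_like(pos)
    new_cov = np.empty_like(cov)
    for i in range(len(pos)):
        d = np.linalg.norm(q_pos - pos[i], axis=1)
        nn = np.argsort(d)[:k]              # k nearest members of Q
        w = 1.0 / (d[nn] + eps)             # inverse-distance weights
        w /= w.sum()
        # Each neighbor predicts this center under its own rigid motion.
        offsets = pos[i] - q_pos[nn]
        pred = q_new_pos[nn] + np.einsum("kij,kj->ki", q_rot[nn], offsets)
        new_pos[i] = w @ pred
        # Blend neighbor rotations, then project back onto a rotation.
        R = project_to_rotation(np.einsum("k,kij->ij", w, q_rot[nn]))
        new_cov[i] = R @ cov[i] @ R.T       # Sigma' = R Sigma R^T
    return new_pos, new_cov
```

A useful sanity check for this kind of scheme is a global rigid motion: if every Gaussian in $Q$ receives the same rotation and translation, all remaining Gaussians should follow that same motion exactly. A brute-force neighbor search is used here for clarity; a spatial index would be the natural replacement for large scenes.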