ObjectMorpher: 3D-Aware Image Editing via Deformable 3DGS Models

Yuhuan Xie, Aoxuan Pan, Yi-Hua Huang, Chirui Chang, Peng Dai, Xin Yu, Xiaojuan Qi

Abstract

Achieving precise, object-level control in image editing remains challenging: 2D methods lack 3D awareness and often yield ambiguous or implausible results, while existing 3D-aware approaches rely on heavy optimization or incomplete monocular reconstructions. We present ObjectMorpher, a unified, interactive framework that converts ambiguous 2D edits into geometry-grounded operations. ObjectMorpher lifts target instances with an image-to-3D generator into editable 3D Gaussian Splatting (3DGS), enabling fast, identity-preserving manipulation. Users drag control points; a graph-based non-rigid deformation with as-rigid-as-possible (ARAP) constraints ensures physically sensible shape and pose changes. A composite diffusion module harmonizes lighting, color, and boundaries for seamless reintegration. Across diverse categories, ObjectMorpher delivers fine-grained, photorealistic edits with superior controllability and efficiency, outperforming 2D drag and 3D-aware baselines on KID, LPIPS, SIFID, and user preference.
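
The abstract's editing step is a graph-based non-rigid deformation with ARAP constraints. As a rough illustration (not the authors' implementation), the sketch below alternates the two classic ARAP steps, a per-node rotation fit and a global Laplacian solve, over a k-nearest-neighbour graph, with the dragged control points held as hard constraints; the function names, uniform edge weights, and hard handle constraints are assumptions made for the sketch.

```python
import numpy as np

def build_knn_graph(points, k=6):
    """Symmetric k-nearest-neighbour graph over node positions (Euclidean)."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    knn = np.argsort(d, axis=1)[:, 1:k + 1]            # column 0 is the node itself
    nbrs = [set() for _ in points]
    for i, row in enumerate(knn):
        for j in row:
            nbrs[i].add(int(j))
            nbrs[int(j)].add(i)                        # keep edges symmetric
    return nbrs

def arap_deform(points, nbrs, handle_idx, handle_pos, iters=10):
    """Alternate per-node rotation fitting (local step) and a Laplacian solve (global step)."""
    points = np.asarray(points, dtype=float)
    handle_idx = np.asarray(handle_idx, dtype=int)
    n = len(points)
    free = np.array([i for i in range(n) if i not in set(handle_idx.tolist())], dtype=int)
    # Uniform-weight graph Laplacian; it stays fixed across iterations.
    L = np.zeros((n, n))
    for i, nb in enumerate(nbrs):
        L[i, i] = len(nb)
        L[i, list(nb)] = -1.0
    new_pts = points.copy()
    new_pts[handle_idx] = handle_pos                   # dragged handles are hard constraints
    for _ in range(iters):
        # Local step: best-fit rotation per node from rest-pose vs. deformed edge vectors.
        R = np.empty((n, 3, 3))
        for i, nb in enumerate(nbrs):
            nb = list(nb)
            P = (points[i] - points[nb]).T             # 3 x |N(i)| rest-pose edges
            Q = (new_pts[i] - new_pts[nb]).T           # 3 x |N(i)| deformed edges
            U, _, Vt = np.linalg.svd(P @ Q.T)
            if np.linalg.det(Vt.T @ U.T) < 0:          # guard against reflections
                Vt[-1] *= -1.0
            R[i] = Vt.T @ U.T
        # Global step: solve L p' = b for the free nodes with handles fixed.
        b = np.zeros((n, 3))
        for i, nb in enumerate(nbrs):
            for j in nb:
                b[i] += 0.5 * (R[i] + R[j]) @ (points[i] - points[j])
        rhs = b[free] - L[np.ix_(free, handle_idx)] @ new_pts[handle_idx]
        new_pts[free] = np.linalg.solve(L[np.ix_(free, free)], rhs)
    return new_pts
```

In the actual pipeline, the graph nodes would presumably come from the lifted 3DGS representation (e.g., a subsample of Gaussian centres), with the remaining Gaussians following the graph through interpolation weights; the sketch also assumes a connected graph with at least one handle so the reduced Laplacian is invertible.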

Figures (8)

  • Figure 1: Unlike text-based methods that fail to localize subjects or interpret geometry, ObjectMorpher uses direct 3D manipulation with real-time interaction. This ensures precise edits while preserving the object's identity and background.
  • Figure 2: Overview of our image editing pipeline. The object is lifted from 2D pixels to high-fidelity 3DGS. Real-time editing with local rigidity is applied based on user input. The object is then repositioned, and a generative model refines the edits for harmonious results.
  • Figure 3: Comparison of graph connections based on Euclidean distance and geodesic distance (see the sketch after this list).
  • Figure 4: Illustration of the training data preparation and the pipeline of our generative composition model.
  • Figure 5: Qualitative comparisons. We show results on 8 subjects (rows) across 5 methods (columns). Our method (Ours) is the only one that faithfully follows the non-rigid user guidance while maintaining photorealism. 2D methods fail to perform the 3D-aware edit, producing results nearly identical to the original input.
  • ...and 3 more figures
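
The distinction in Figure 3 is worth making concrete: wiring the deformation graph by straight-line (Euclidean) proximity can tie together parts that are close in space but far apart along the surface (e.g., a hand resting near a knee), whereas geodesic connectivity, computed as shortest paths over a local proximity graph, does not. The sketch below is an assumed construction for illustration, not the paper's exact recipe; euclidean_knn, geodesic_knn, and the radius eps are hypothetical names and parameters.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def euclidean_knn(points, k):
    """Neighbours ranked purely by straight-line distance."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    return np.argsort(d, axis=1)[:, 1:k + 1]             # skip self at column 0

def geodesic_knn(points, k, eps):
    """Neighbours ranked by shortest-path distance over an eps-radius proximity graph."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    adj = csr_matrix(np.where(d <= eps, d, 0.0))          # keep only short, on-surface edges
    geo = dijkstra(adj, directed=False)                   # all-pairs geodesics (inf if unreachable)
    return np.argsort(geo, axis=1)[:, 1:k + 1]
```

The all-pairs Dijkstra here is only meant for small, sketch-scale point sets. With geodesic connectivity, two nodes are linked only if a chain of short surface edges joins them, which is the behaviour Figure 3 contrasts with plain Euclidean k-NN.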