Image Sculpting: Precise Object Editing with 3D Geometry Control
Jiraphon Yenphraphai, Xichen Pan, Sainan Liu, Daniele Panozzo, Saining Xie
TL;DR
Image Sculpting introduces a 3D-geometry–driven framework for precise object editing from a single image. It converts a 2D object into a textured 3D model, allows interactive 3D deformation, and uses a coarse-to-fine diffusion-based enhancement to produce high-fidelity 2D outputs that preserve geometry and texture. The approach enables precise pose editing, rotation, translation, 3D composition, carving, and serial addition, and is evaluated on SculptingBench against strong baselines using quantitative metrics for texture and geometry. By integrating single-view reconstruction, graphics-style deformation, and diffusion-based refinement, the work combines graphics pipelines with generative models for controllable, physically plausible image editing.
Abstract
We present Image Sculpting, a new framework for editing 2D images by incorporating tools from 3D geometry and graphics. This approach differs markedly from existing methods, which are confined to 2D spaces and typically rely on textual instructions, leading to ambiguity and limited control. Image Sculpting converts 2D objects into 3D, enabling direct interaction with their 3D geometry. After editing, these objects are re-rendered into 2D and merged into the original image, producing high-fidelity results through a coarse-to-fine enhancement process. The framework supports precise, quantifiable, and physically plausible editing options such as pose editing, rotation, translation, 3D composition, carving, and serial addition. It marks an initial step towards combining the creative freedom of generative models with the precision of graphics pipelines.
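
To make "precise, quantifiable" editing concrete, the following is a minimal sketch of what a rotation and a translation look like once an object is available as 3D geometry. It is not the authors' implementation: the function names `rotate_about_y` and `translate` are hypothetical, and a unit cube stands in for a mesh reconstructed from an image. The point is only that such edits are specified by exact numbers (degrees, offsets) rather than by text prompts.

```python
import numpy as np

def rotate_about_y(vertices, degrees):
    """Rotate an (N, 3) vertex array about the vertical axis through its centroid."""
    theta = np.radians(degrees)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, 0.0, s],
                    [0.0, 1.0, 0.0],
                    [-s, 0.0, c]])
    centroid = vertices.mean(axis=0)
    # Center the object, rotate, then move it back to where it was.
    return (vertices - centroid) @ rot.T + centroid

def translate(vertices, offset):
    """Shift every vertex by the same (dx, dy, dz) offset."""
    return vertices + np.asarray(offset, dtype=float)

# Toy stand-in for a reconstructed mesh: the eight corners of a unit cube.
cube = np.array([[x, y, z] for x in (0.0, 1.0)
                           for y in (0.0, 1.0)
                           for z in (0.0, 1.0)])

# Rotate by exactly 90 degrees, then move 0.5 units along x.
edited = translate(rotate_about_y(cube, 90.0), [0.5, 0.0, 0.0])
```

In a full pipeline of the kind the abstract describes, the edited geometry would then be re-rendered to 2D and refined; that rendering and enhancement stage is outside the scope of this sketch.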
