DreamArt: Generating Interactable Articulated Objects from a Single Image

Ruijie Lu; Yu Liu; Jiaxiang Tang; Junfeng Ni; Yuxiang Wang; Diwen Wan; Gang Zeng; Yixin Chen; Siyuan Huang

TL;DR

DreamArt tackles the challenge of generating interactable articulated 3D assets from a single image. It introduces a three-stage pipeline: part-aware 3D object generation with mask-guided segmentation and amodal completion, articulation video synthesis using movable-part masks and amodal cues, and joint estimation with a differentiable texture refinement to realize plausible motion. The approach demonstrates state-of-the-art performance in articulation video synthesis and video-conditioned asset generation, with strong generalization to in-the-wild images. This work enables scalable production of high-fidelity, manipulable assets for embodied AI, AR/VR, and robotics.

Abstract

Generating articulated objects, such as laptops and microwaves, is a crucial yet challenging task with extensive applications in Embodied AI and AR/VR. Current image-to-3D methods primarily focus on surface geometry and texture, neglecting part decomposition and articulation modeling. Meanwhile, neural reconstruction approaches (e.g., NeRF or Gaussian Splatting) rely on dense multi-view or interaction data, limiting their scalability. In this paper, we introduce DreamArt, a novel framework for generating high-fidelity, interactable articulated assets from single-view images. DreamArt employs a three-stage pipeline. First, it reconstructs part-segmented and complete 3D object meshes through a combination of image-to-3D generation, mask-prompted 3D segmentation, and part amodal completion. Second, we fine-tune a video diffusion model to capture part-level articulation priors, using movable-part masks as prompts and amodal images to mitigate ambiguities caused by occlusion. Finally, DreamArt optimizes the articulation motion, represented by a dual quaternion, and performs global texture refinement and repainting to ensure coherent, high-quality textures across all parts. Experimental results demonstrate that DreamArt generates high-quality articulated objects with accurate part shapes, high appearance fidelity, and plausible articulation, thereby providing a scalable solution for articulated asset generation. Our project page is available at https://dream-art-0.github.io/DreamArt/.
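
The abstract states that articulation motion is represented by a dual quaternion. As a rough illustration of what that representation encodes, the sketch below builds a unit dual quaternion from a rotation and a translation and applies it to a 3D point. It is not DreamArt's implementation: the function names, the (w, x, y, z) component order, and the use of NumPy are all assumptions made for this example, and no optimization step is shown.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ])

def qconj(q):
    """Quaternion conjugate (negate the vector part)."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def dual_quat(axis, angle, translation):
    """Hypothetical helper: build a unit dual quaternion (q_r, q_d).

    q_r is the rotation quaternion; q_d = 0.5 * t * q_r encodes the
    translation t applied after the rotation.
    """
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    q_r = np.concatenate([[np.cos(angle / 2.0)], np.sin(angle / 2.0) * axis])
    t = np.concatenate([[0.0], np.asarray(translation, dtype=float)])
    q_d = 0.5 * qmul(t, q_r)
    return q_r, q_d

def transform_point(q_r, q_d, p):
    """Apply the rigid motion encoded by (q_r, q_d) to a 3D point p."""
    p_q = np.concatenate([[0.0], np.asarray(p, dtype=float)])
    rotated = qmul(qmul(q_r, p_q), qconj(q_r))[1:]
    # Recover the translation from the dual part: t = 2 * q_d * conj(q_r).
    t = 2.0 * qmul(q_d, qconj(q_r))[1:]
    return rotated + t

# Revolute-style motion: 90 degrees about the z-axis, then a shift along z.
q_r, q_d = dual_quat(axis=[0, 0, 1], angle=np.pi / 2, translation=[0, 0, 1])
hinged = transform_point(q_r, q_d, [1.0, 0.0, 0.0])

# Prismatic-style motion: no rotation, pure slide along x.
s_r, s_d = dual_quat(axis=[1, 0, 0], angle=0.0, translation=[0.2, 0, 0])
slid = transform_point(s_r, s_d, [1.0, 2.0, 3.0])
```

One appeal of this parameterization is that a single 8-number object covers both hinge-like and slider-like motion, so the same variables can describe a laptop lid or a drawer.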
