DragAPart: Learning a Part-Level Motion Prior for Articulated Objects
Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi
TL;DR
DragAPart learns a part-level motion prior for articulated objects by fine-tuning a pre-trained image generator on the synthetic Drag-a-Move dataset, using a novel multi-resolution drag encoding. Given drags on an image, it produces part-level deformations, and domain randomization helps it generalize to real images and unseen categories. The approach outperforms prior drag-conditioned methods in both quantitative metrics and qualitative assessments, and it supports downstream tasks such as moving-part segmentation and motion analysis. The Drag-a-Move dataset provides ground-truth drags and articulations to support data-driven learning of fine-grained dynamics.
Abstract
We introduce DragAPart, a method that, given an image and a set of drags as input, generates a new image of the same object that responds to the action of the drags. Unlike prior work that focused on repositioning objects, DragAPart predicts part-level interactions, such as opening and closing a drawer. We study this problem as a proxy for learning a generalist motion model, not restricted to a specific kinematic structure or object category. We start from a pre-trained image generator and fine-tune it on Drag-a-Move, a new synthetic dataset that we introduce. Combined with a new encoding for the drags and dataset randomization, the model generalizes well to real images and different categories. Compared to prior motion-controlled generators, we demonstrate much better part-level motion understanding.
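To make the notion of "a set of drags" and an encoding at several resolutions more concrete, here is a minimal sketch. It is an illustration only and is not the encoding used in DragAPart, whose details are not given in this summary. The names `Drag` and `encode_drags`, the normalized coordinate convention, the choice of resolutions, and the decision to store a displacement in the start cell are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: a drag is a (start, end) pair of points,
# each given as (x, y) in normalized image coordinates in [0, 1).
Point = Tuple[float, float]
Drag = Tuple[Point, Point]
Grid = List[List[Optional[Tuple[float, float]]]]


def encode_drags(drags: List[Drag],
                 resolutions: Tuple[int, ...] = (8, 16, 32)) -> Dict[int, Grid]:
    """Rasterize drags into one sparse square grid per resolution.

    A cell holds None when no drag starts in it, otherwise the drag's
    displacement (dx, dy) in normalized units. This is an assumed,
    simplified scheme chosen for illustration.
    """
    pyramid: Dict[int, Grid] = {}
    for res in resolutions:
        grid: Grid = [[None] * res for _ in range(res)]
        for (x0, y0), (x1, y1) in drags:
            # Map the normalized start point to a cell, clamping to the grid.
            col = min(int(x0 * res), res - 1)
            row = min(int(y0 * res), res - 1)
            grid[row][col] = (x1 - x0, y1 - y0)
        pyramid[res] = grid
    return pyramid


# One drag pulling a point a quarter of the image width to the right.
drags = [((0.25, 0.5), (0.5, 0.5))]
enc = encode_drags(drags)
```

In a sketch like this, the same drag appears once in each grid, so layers of a generator that operate at different spatial resolutions could each receive a version of the drag matched to their own grid size.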
