UniGarmentManip: A Unified Framework for Category-Level Garment Manipulation via Dense Visual Correspondence
Ruihai Wu, Haoran Lu, Yiyan Wang, Yubo Wang, Hao Dong
TL;DR
This paper tackles category-level garment manipulation under diverse geometries and deformations by learning dense visual correspondence that is aware of topology and function. The approach first learns deformation- and object-agnostic point representations through self-supervised cross-deformation ($L_{CD}$) and cross-object ($L_{CO}$) contrastive learning, refined by a coarse-to-fine loss ($L_{C2F}$), and then adapts to specific tasks with few-shot functional fine-tuning. A skeleton-based topology (via Skeleton Merger) enables robust cross-object correspondence, while projection from flat to deformed states unifies the representations across garment states. Extensive simulation across three garment categories and three tasks, plus real-world evaluation with dual-arm manipulation, demonstrates superior generalization and policy generation from dense correspondences, enabling one-model, multi-task garment manipulation with minimal demonstrations.
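As a rough illustration of the contrastive objective mentioned above, the sketch below computes an InfoNCE-style loss for a single point feature. It is an assumption-laden sketch, not the paper's definition of $L_{CD}$ or $L_{CO}$: the exact loss form, the function name `info_nce`, the `temperature` value, and the way positives and negatives are sampled are all hypothetical. The intent is only to show how a feature on one garment state can be pulled toward its corresponding point on another state (or another garment) and pushed away from non-corresponding points.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor point feature (illustrative only).

    anchor:    (D,)   feature of a point on one garment observation
    positive:  (D,)   feature of the corresponding point (other deformation/object)
    negatives: (K, D) features of non-corresponding points
    """
    def normalize(x):
        # Unit-normalize along the feature dimension so dot products are cosine similarities.
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a, p, n = normalize(anchor), normalize(positive), normalize(negatives)
    # Similarity to the positive goes first, followed by similarities to the negatives.
    logits = np.concatenate([[a @ p], n @ a]) / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    # Negative log-probability of picking the positive among all candidates.
    return -(logits[0] - np.log(np.exp(logits).sum()))
```

The loss is near zero when the anchor is far more similar to its positive than to any negative, and grows when a negative is the closer match.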
Abstract
Garment manipulation (e.g., unfolding, folding, and hanging clothes) is essential for future robots to accomplish home-assistant tasks, yet it is highly challenging due to the diversity of garment configurations, geometries, and deformations. Although able to manipulate similarly shaped garments in a specific task, previous works mostly have to design different policies for different tasks, cannot generalize to garments with diverse geometries, and often rely heavily on human-annotated data. In this paper, we leverage the property that garments in a given category have similar structures, and learn category-level topological dense (point-level) visual correspondence among garments under different deformations in a self-supervised manner. The topological correspondence can be easily adapted to functional correspondence to guide manipulation policies for various downstream tasks, using only one or a few demonstrations. Experiments on garments from 3 categories and 3 representative tasks in diverse scenarios (using one or two arms, taking one or more steps, and starting from flat or messy garments) demonstrate the effectiveness of our proposed method. Project page: https://warshallrho.github.io/unigarmentmanip.
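To make the idea of correspondence-guided manipulation concrete, the following minimal sketch transfers demonstrated manipulation points from a demonstration garment to a new garment by nearest-neighbor search in feature space. It assumes per-point features are already available as NumPy arrays; the function name `transfer_points`, the use of cosine similarity, and the argmax selection rule are illustrative assumptions rather than the method described in the paper.

```python
import numpy as np

def transfer_points(demo_feats, demo_idx, target_feats):
    """Map demonstrated point indices onto a new garment (illustrative only).

    demo_feats:   (N_demo, D)   per-point features of the demonstration garment
    demo_idx:     sequence of indices of demonstrated points (e.g., grasp points)
    target_feats: (N_target, D) per-point features of the new garment
    Returns an array with one target point index per demonstrated point.
    """
    # Unit-normalize so the matrix product below yields cosine similarities.
    d = demo_feats / np.linalg.norm(demo_feats, axis=1, keepdims=True)
    t = target_feats / np.linalg.norm(target_feats, axis=1, keepdims=True)
    sim = d[demo_idx] @ t.T  # shape (len(demo_idx), N_target)
    # Pick the most similar target point for each demonstrated point.
    return sim.argmax(axis=1)
```

Under these assumptions, a single demonstration that marks, say, two grasp points on one garment could be reused on an unseen garment of the same category by looking up the best-matching points.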
