GarmentTracking: Category-Level Garment Pose Tracking

Han Xue, Wenqiang Xu, Jieyi Zhang, Tutian Tang, Yutong Li, Wenxin Du, Ruolin Ye, Cewu Lu

TL;DR

This work introduces category-level garment pose tracking by combining a VR-based data collection system (VR-Garment), a large VR-Folding dataset, and a real-time GarmentTracking framework. GarmentTracking predicts garment pose in a canonical space using inter-frame fusion, refines predictions with a NOCS PC-Mesh refiner, and maps canonical geometry to the task space via a warp field, enabling complete pose and surface reconstruction under large non-rigid deformations. Empirical results show significant improvements over prior single-frame methods, strong robustness to perturbations, real-time performance at 15 FPS, and promising generalization to real-world data, making VR-Garment a versatile platform for future garment manipulation research. The work also provides a blueprint for scalable manipulation datasets and end-to-end non-rigid tracking pipelines that can benefit downstream MR/AR and robotic tasks.

Abstract

Garments are important to humans. A visual system that can estimate and track the complete garment pose can be useful for many downstream tasks and real-world applications. In this work, we present a complete package to address the category-level garment pose tracking task: (1) A recording system, VR-Garment, with which users can manipulate virtual garment models in simulation through a VR interface. (2) A large-scale dataset, VR-Folding, with complex garment pose configurations arising during manipulation tasks such as flattening and folding. (3) An end-to-end online tracking framework, GarmentTracking, which predicts the complete garment pose in both canonical space and task space given a point cloud sequence. Extensive experiments demonstrate that the proposed GarmentTracking performs well even when the garment undergoes large non-rigid deformation. It outperforms the baseline approach in both speed and accuracy. We hope our proposed solution can serve as a platform for future research. Code and datasets are available at https://garment-tracking.robotflow.ai.
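
To make the distinction between canonical space and task space concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that canonical (NOCS-style) coordinates are obtained by uniformly scaling a rest-state shape into the unit cube, and that a warp field can be treated as a function from a canonical point to its task-space position. The names `normalize_to_canonical`, `apply_warp_field`, and `toy_fold` are hypothetical, and the hand-written fold stands in for what a learned network would predict.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]


def normalize_to_canonical(rest_points: List[Point]) -> List[Point]:
    """Map rest-state points into the unit cube [0, 1]^3.

    Assumption: uniform scaling by the longest bounding-box side,
    with the bounding-box center moved to (0.5, 0.5, 0.5).
    """
    mins = [min(p[i] for p in rest_points) for i in range(3)]
    maxs = [max(p[i] for p in rest_points) for i in range(3)]
    # Fall back to 1.0 if all points coincide, to avoid dividing by zero.
    scale = max(maxs[i] - mins[i] for i in range(3)) or 1.0
    center = [(mins[i] + maxs[i]) / 2.0 for i in range(3)]
    return [
        tuple((p[i] - center[i]) / scale + 0.5 for i in range(3))
        for p in rest_points
    ]


def apply_warp_field(
    canonical_points: List[Point], warp: Callable[[Point], Point]
) -> List[Point]:
    """Send every canonical point to task space through the warp field."""
    return [warp(p) for p in canonical_points]


def toy_fold(p: Point) -> Point:
    """Hypothetical warp field: fold the shape in half across x = 0.5."""
    x, y, z = p
    if x > 0.5:
        # Mirror the right half onto the left and lift it slightly.
        return (1.0 - x, y, z + 0.01)
    return (x, y, z)


# Two corners of a flat rest-state garment, in arbitrary units.
rest = [(0.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
canonical = normalize_to_canonical(rest)
task = apply_warp_field(canonical, toy_fold)
```

In this toy setup the canonical coordinates stay fixed for a given garment while only the warp field changes from frame to frame, which is the intuition behind tracking pose in canonical space and then mapping it to task space.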

Paper Structure

This paper contains 40 sections, 2 equations, 11 figures, 6 tables.

Figures (11)

  • Figure 1: The pipeline of our Virtual Reality recording system (VR-Garment). (a) A volunteer puts on a VR headset and VR gloves. (b) Following the guidance of a specially designed UI, the volunteer collects data efficiently. (c) After recording, we re-render multi-view RGB-D images with Unity and obtain masks and deformed garment meshes with NOCS labels.
  • Figure 2: The overview of GarmentTracking. Given the per-point NOCS coordinate of the first frame and a rough canonical shape (mesh NOCS), our tracking method takes two frames of the partial point cloud as input. In stage 1, the NOCS predictor will generate an inter-frame fusion feature and predict raw NOCS coordinates. In stage 2, the NOCS refiner will refine the NOCS coordinates and the canonical shape simultaneously. In stage 3, the warp field mapper will predict the warp field which maps from canonical space to task space.
  • Figure 3: PC-Mesh Fusion Refiner
  • Figure 4: The canonical coordinate prediction results on the VR-Folding dataset.
  • Figure 5: Qualitative results of pose estimation for unseen instances in the VR-Folding dataset. In long-sequence tracking (shown in the lower part), our prediction remains highly consistent with the ground truth (GT), while GarmentNets outputs a series of meshes that lack stability.
  • ...and 6 more figures