Learning to Grasp Clothing Structural Regions for Garment Manipulation Tasks

Wei Chen; Dongmyoung Lee; Digby Chappell; Nicolas Rojas

TL;DR

This work tackles deformable garment manipulation by identifying and grasping structural regions, specifically collars, to enable tasks such as garment hanging. It introduces a depth-based perception pipeline, trained from a 10-minute real-world video, that segments collars in depth images. Skeleton-based center region extraction and surface-variation-driven grasp point selection follow, with PCA-based pose estimation for 6D grasping. The approach achieves grasping success rates of $92\%$ for a single folded garment, $80\%$ for a single crumpled garment, and $50\%$ for three crumpled garments, and it outperforms baselines that ignore garment structure. A garment-hanging task demonstrates that open-loop control is feasible when the grasp is guided by the garment's structural region, which is relevant to domestic, healthcare, and industrial manipulation of textiles.

Abstract

When performing cloth-related tasks, such as garment hanging, it is often important to identify and grasp certain structural regions -- a shirt's collar as opposed to its sleeve, for instance. However, due to cloth deformability, these manipulation activities, which are essential in domestic, health care, and industrial contexts, remain challenging for robots. In this paper, we focus on how to segment and grasp structural regions of clothes to enable manipulation tasks, using hanging tasks as a case study. To this end, a neural network-based perception system is proposed to segment a shirt's collar from areas that represent the rest of the scene in a depth image. Trained on a 10-minute video of a human manipulating shirts, our perception system is capable of generalizing to other shirts regardless of texture as well as to other types of collared garments. A novel grasping strategy is then proposed based on the segmentation to determine the grasping pose. Experiments demonstrate that our proposed grasping strategy achieves 92\%, 80\%, and 50\% grasping success rates with one folded garment, one crumpled garment, and three crumpled garments, respectively. Our grasping strategy performs considerably better than tested baselines that do not take into account the structural nature of the garments. With the proposed region segmentation and grasping strategy, challenging garment hanging tasks are successfully implemented using an open-loop control policy. Supplementary material is available at https://sites.google.com/view/garment-hanging

Paper Structure (19 sections, 4 equations, 7 figures, 3 tables)

Introduction
Related Work
Cloth Perception
Cloth Grasping for Manipulation
Methods
Cloth Region Segmentation
Data Acquisition
Semantic Segmentation with Neural Networks
Grasping Pose Estimation
Center Region Extraction
Grasping Position Selection
Grasping Orientation Estimation
Insertion Grasping Execution
Garment-hanging
Experiment
...and 4 more sections

Figures (7)

Figure 1: In the proposed method, a robot learns to automatically detect a clothing structural region (collar in our case study) in severely crumpled garments and execute its grasping. This is then used for cloth manipulation (hanging the garment on a cloth tree in our case study).
Figure 2: Pipeline of our method: we collect depth images and their corresponding RGB images of garments in diverse configurations. The collected depth images are labelled by extracting the blue pixels from the RGB images and are used as the training set. At run time, we use the depth image as input and predict the collar with the trained segmentation neural network. The optimal grasping point is obtained by fitting a skeleton to the predicted collar and calculating the surface variation. A local PCA is finally conducted near this point to find the grasping orientation. A set of action primitives, including grasping and hanging, is designed for real-world robot execution.
Figure 3: Left: template shirts used for perception system training in Section III-A. Right: various garments used for testing. The collar is painted blue for ground-truth extraction.
Figure 4: Center Region Extraction: (a) input depth image; (b) prediction result; (c) clustering; (d) skeletonization and computation of the skeleton center. We compare the centroid (blue dot) and the skeleton center (red dot).
Figure 5: Grasping Pose Estimation: garment RGB image and extracted collar region point cloud. The point (red point) with the highest surface variation in the center region (dashed circle) is selected as the grasping point. Three eigenvectors ($\overrightarrow{\bm{v}}_{0}$, $\overrightarrow{\bm{v}}_{1}$, $\overrightarrow{\bm{v}}_{2}$) from this point and its nearest neighbours are used for the pose estimation.
...and 2 more figures
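
The grasp point selection and orientation step described in the captions of Figures 2 and 5 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the commonly used definition of surface variation, the smallest eigenvalue of the local covariance divided by the sum of all three, and it assumes the center region is a sphere around the skeleton center. The function names (`local_pca`, `surface_variation`, `select_grasp`) and the parameters `k` and `radius` are hypothetical.

```python
import numpy as np


def local_pca(points, index, k=20):
    """PCA of the k nearest neighbours of points[index] (the point itself included).

    Returns eigenvalues in ascending order and the matching eigenvectors
    as columns. `k` is an assumed neighbourhood size.
    """
    dists = np.linalg.norm(points - points[index], axis=1)
    neighbours = points[np.argsort(dists)[:k]]
    cov = np.cov(neighbours.T)              # 3x3 covariance of the neighbourhood
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh sorts eigenvalues ascending
    return np.clip(eigvals, 0.0, None), eigvecs


def surface_variation(eigvals):
    """Assumed definition: smallest eigenvalue over the eigenvalue sum."""
    total = eigvals.sum()
    return float(eigvals[0] / total) if total > 0 else 0.0


def select_grasp(points, center, radius, k=20):
    """Pick the point with the highest surface variation inside the center region.

    `center` stands in for the skeleton center and `radius` for the size of
    the dashed circle in Figure 5; both are inputs to this sketch.
    Returns the grasp point and a 3x3 matrix whose columns are the local
    eigenvectors (smallest-eigenvalue direction first).
    """
    in_region = np.flatnonzero(np.linalg.norm(points - center, axis=1) <= radius)
    if in_region.size == 0:
        raise ValueError("no points inside the center region")
    best = max(in_region, key=lambda i: surface_variation(local_pca(points, i, k)[0]))
    _, eigvecs = local_pca(points, best, k)
    return points[best], eigvecs
```

In this sketch the eigenvector of the smallest eigenvalue approximates the local surface normal, and the other two span the local tangent plane. How the three axes map onto the gripper frame is a design choice that the captions do not specify.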
