Robot Deformable Object Manipulation via NMPC-generated Demonstrations in Deep Reinforcement Learning
Haoyuan Wang, Zihao Dong, Hongliang Lei, Zejia Zhang, Weizhuang Shi, Wei Luo, Weiwei Wan, Jian Huang
TL;DR
This work tackles robotic manipulation of deformable objects by combining demonstration-enhanced DRL with NMPC-generated demonstrations. The core method, HGCR-DDPG, integrates a high-dimensional HTSK fuzzy grasp-point selector, a GABC-augmented Rainbow-DDPG, and a CPL framework to learn more efficiently from demonstrations. The work also introduces a low-cost NMPC-based data-collection pipeline that uses a spring-mass model to generate demonstrations without extensive human labor. The approach is validated in simulation and in physical experiments on diagonal folding, central-axis folding, and flattening, with physical-experiment success rates of 83.3%, 80%, and 100%, respectively. Compared with large-model approaches, HGCR-DDPG is lightweight and needs fewer computational resources, while NMPC demonstrations yield results comparable to those obtained with human demonstrations, enabling practical, task-specific adaptation for deformable-object manipulation.
Abstract
In this work, we study robotic manipulation of deformable objects using demonstration-enhanced reinforcement learning (RL). To improve the learning efficiency of RL, we improve the use of demonstration data in several ways and propose the HGCR-DDPG algorithm. It combines a novel high-dimensional fuzzy approach for grasping-point selection, a refined behavior-cloning method that enhances data-driven learning in Rainbow-DDPG, and a sequential policy-learning strategy. Compared to the baseline algorithm (Rainbow-DDPG), HGCR-DDPG achieved 2.01 times the global average reward and reduced the global average standard deviation to 45% of the baseline's. To reduce the human labor cost of collecting demonstrations, we propose a low-cost demonstration collection method based on Nonlinear Model Predictive Control (NMPC). Simulation results show that demonstrations collected through NMPC can be used to train HGCR-DDPG, achieving results comparable to those obtained with human demonstrations. To validate the feasibility of the proposed methods in real-world environments, we conducted physical experiments in which a robot manipulated fabric to perform three tasks: diagonal folding, central-axis folding, and flattening. The proposed method achieved success rates of 83.3%, 80%, and 100% on these three tasks, respectively, validating the effectiveness of our approach. Compared to current large-model approaches for robot manipulation, the proposed algorithm is lightweight, requires fewer computational resources, and can be customized and adapted efficiently to specific tasks.
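The idea of generating demonstrations with a model-based controller instead of a human operator can be illustrated with a toy example. The sketch below is not the paper's formulation. It assumes a 1D chain of point masses joined by linear springs in place of a cloth mesh, and a simple random-shooting search in place of a proper NMPC solver. All names (`step`, `rollout_cost`, `collect_demo`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

REST = 0.1  # assumed rest length of each spring (m)


def step(pos, vel, grasp, k=50.0, mass=0.05, damping=0.5, dt=0.01):
    """One semi-implicit Euler step of a 1D spring-mass chain.

    Node 0 is the grasped node: it is placed directly at `grasp`.
    All other nodes move under spring and damping forces.
    """
    pos, vel = pos.copy(), vel.copy()
    pos[0], vel[0] = grasp, 0.0
    stretch = np.diff(pos) - REST      # extension of each spring
    force = -damping * vel             # viscous damping on every node
    force[:-1] += k * stretch          # spring pulls its left node to the right
    force[1:] -= k * stretch           # and its right node to the left
    vel[1:] += dt * force[1:] / mass   # update free nodes only
    pos[1:] += dt * vel[1:]
    return pos, vel


def rollout_cost(pos, vel, grasp, moves, goal, substeps=5):
    """Simulate a candidate sequence of grasp moves and sum the squared
    distance between the free end of the chain and the goal."""
    cost = 0.0
    for move in moves:
        grasp = grasp + move
        for _ in range(substeps):
            pos, vel = step(pos, vel, grasp)
        cost += (pos[-1] - goal) ** 2
    return cost


def collect_demo(n_nodes=5, goal=-0.2, horizon=6, n_samples=32,
                 n_steps=60, max_move=0.03, substeps=5, seed=0):
    """Receding-horizon control loop that records (state, action) pairs."""
    rng = np.random.default_rng(seed)
    pos = np.arange(n_nodes) * REST    # chain starts at rest, free end at 0.4
    vel = np.zeros(n_nodes)
    grasp = float(pos[0])
    demo = []
    for _ in range(n_steps):
        # Sample candidate move sequences and keep the cheapest one.
        candidates = rng.uniform(-max_move, max_move, size=(n_samples, horizon))
        costs = [rollout_cost(pos, vel, grasp, c, goal, substeps)
                 for c in candidates]
        action = float(candidates[int(np.argmin(costs))][0])
        demo.append((pos.copy(), action))  # store the demonstration pair
        # Apply only the first move, then re-plan at the next step.
        grasp += action
        for _ in range(substeps):
            pos, vel = step(pos, vel, grasp)
    return demo, pos
```

Calling `collect_demo()` returns a list of (node positions, grasp move) pairs together with the final node positions. In a demonstration-enhanced RL setup, pairs of this kind are what would be stored as demonstration data, which is why a reasonably accurate model can substitute for human teleoperation.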
