Learning Goal-oriented Bimanual Dough Rolling Using Dynamic Heterogeneous Graph Based on Human Demonstration

Junjia Liu, Chenzui Li, Shixiong Wang, Zhipeng Dong, Sylvain Calinon, Miao Li, Fei Chen

TL;DR

A dynamic heterogeneous graph-based model for learning goal-oriented soft object manipulation policies utilizes graphs as a unified representation for both states and policy learning, demonstrating its superiority in achieving human-like behavior.

Abstract

Soft object manipulation poses significant challenges for robots, requiring effective techniques for state representation and manipulation policy learning. State representation involves capturing the dynamic changes in the environment, while manipulation policy learning focuses on establishing the relationship between robot actions and state transformations to achieve specific goals. To address these challenges, this research paper introduces a novel approach: a dynamic heterogeneous graph-based model for learning goal-oriented soft object manipulation policies. The proposed model utilizes graphs as a unified representation for both states and policy learning. By leveraging the dynamic graph, we can extract crucial information regarding object dynamics and manipulation policies. Furthermore, the model facilitates the integration of demonstrations, enabling guided policy learning. To evaluate the efficacy of our approach, we designed a dough rolling task and conducted experiments using both a differentiable simulator and a real-world humanoid robot. Additionally, several ablation studies were performed to analyze the effect of our method, demonstrating its superiority in achieving human-like behavior.

Paper Structure

This paper contains 16 sections, 6 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: A CURI robot performing a bimanual dough rolling task with a rolling pin, using the pre-trained dynamic heterogeneous graph model.
  • Figure 2: The proposed pipeline of soft object manipulation, which contains two parts: dynamic heterogeneous graph policy learning in the simulator with the guidance of demonstration, and planning and control for deploying on the real-world humanoid robot.
  • Figure 3: Learning performance of the baselines and DGform, compared in terms of rewards and three evaluation metrics (SDF, density, and IoU) computed with the help of the differentiable simulator. The comparison between PPO-based methods demonstrates the effectiveness of the graph representation, while the three DGform variants show the feasibility of learning a policy via graphs and the role of demonstration guidance and Lagrangian terms.
  • Figure 4: An example of DGform inference in the simulator. The short horizon $H$ in the simulation is set to $50$. After each short horizon, the rolling pin is reset to its initial state, and its next motion is inferred from the current deformation and the goal state. The first row shows a sample sequence of rolling pin movements, and the second row shows the corresponding visualization of the object graph abstractions. Red nodes are boundary nodes, and black nodes are center nodes.
  • Figure 5: The real robot experiment consists of five rollouts. At the start of each rollout, with the robot at its initial pose, the camera in CURI's head captured an RGB-D image of the dough. The image was converted into a heterogeneous graph through graph abstraction and passed as input to the pre-trained DGform model. The model generated a path with 50 waypoints, which bimanual LQT then transformed into a smooth dual-arm trajectory with 10,000 points. The trajectories were executed by the Cartesian motion controller embedded in CURI. The first row shows the recording from a third-person perspective, while the last two rows show observations from the first-person perspective camera.