Dexterous Manipulation Based on Prior Dexterous Grasp Pose Knowledge

Hengxu Yan, Haoshu Fang, Cewu Lu

TL;DR

This work tackles dexterous manipulation by integrating prior dexterous grasp pose knowledge into a two-phase framework: first generate an initial grasp pose targeting the object's functional part using segmentation and AnyGrasp-based proposals, then refine the grasp through PPO-based reinforcement learning with partial-view observations. The method decomposes rewards into interaction, completion, and restriction components to guide safe and efficient manipulation, and maps two-finger grasps to a full dexterous hand, enabling realistic control of a high-DoF system. Extensive simulation on four tasks and real-world tests demonstrate substantial gains in learning efficiency and success rates over baselines, with successful transfer across robotic platforms. The results suggest that leveraging structured prior knowledge can substantially improve sample efficiency and robustness in complex dexterous manipulation, paving the way for practical deployment in varied environments.
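As a rough illustration of the three-part reward decomposition described above, the sketch below combines an interaction term, a completion term, and a restriction term into one scalar. Only the three component names come from the summary; the functional form of each term, the weights, and all function and parameter names are assumptions made for illustration and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights: the summary does not say how the terms are balanced.
    interaction: float = 1.0
    completion: float = 1.0
    restriction: float = 1.0


def interaction_term(fingertip_dists: Sequence[float]) -> float:
    """Assumed form: negative mean fingertip distance to the functional part."""
    return -sum(fingertip_dists) / len(fingertip_dists)


def completion_term(progress: float, success: bool) -> float:
    """Assumed form: task progress clipped to [0, 1], plus a bonus on success."""
    return min(max(progress, 0.0), 1.0) + (1.0 if success else 0.0)


def restriction_term(action: Sequence[float], limit: float) -> float:
    """Assumed form: penalty on the part of the action norm that exceeds a limit."""
    norm = math.sqrt(sum(a * a for a in action))
    return -max(0.0, norm - limit)


def total_reward(
    fingertip_dists: Sequence[float],
    progress: float,
    success: bool,
    action: Sequence[float],
    limit: float,
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Weighted sum of the three assumed reward components."""
    return (
        weights.interaction * interaction_term(fingertip_dists)
        + weights.completion * completion_term(progress, success)
        + weights.restriction * restriction_term(action, limit)
    )
```

Keeping the terms as separate functions makes it easy to log each component during training and see which one dominates the learning signal.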

Abstract

Dexterous manipulation has received considerable attention in recent research. Existing studies have predominantly concentrated on reinforcement learning methods to address the substantial degrees of freedom in hand movements. Nonetheless, these methods typically suffer from low efficiency and accuracy. In this work, we introduce a novel reinforcement learning approach that leverages prior dexterous grasp pose knowledge to enhance both efficiency and accuracy. Unlike previous work, which always uses a fixed dexterous grasp pose for the robotic hand, we decouple the manipulation process into two distinct phases: first, we generate a dexterous grasp pose targeting the functional part of the object; after that, we employ reinforcement learning to comprehensively explore the environment. Our findings suggest that the majority of learning time is spent identifying the appropriate initial position and selecting the optimal manipulation viewpoint. Experimental results demonstrate significant improvements in learning efficiency and success rates across four distinct tasks.

Paper Structure

This paper contains 19 sections, 11 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: For the tasks of lifting the bucket and opening the laptop, we set the initial dexterous grasp pose to facilitate successful task completion.
  • Figure 2: Illustration of our dexterous manipulation method. We employ PPO to teach the dexterous hand how to manipulate objects based on a dexterous grasp pose. (1) Starting with a partial-view point cloud captured by the initial camera, we use PointNet1 to segment the functional part of the object, which is then used to generate a set of two-finger grasp poses with AnyGrasp. These poses are subsequently mapped to a dexterous grasp space, where collision detection is applied to select an appropriate dexterous grasp pose. (2) PointNet2 serves as our backbone to extract features from the partial-view point cloud obtained by the RL camera. The backbone is pre-trained on a segmentation network before being used in RL training.
  • Figure 3: Mapping of the coordinate system from two-finger grasp poses to four grasp types for the dexterous hand.
  • Figure 4: The center of the red ball indicates the position the finger should approach, as shown at the top of the figure. It will adjust as the functional part of the object changes. Below, the illustrations depict the segmentation results for the functional parts across four tasks.
  • Figure 5: Illustration of the success rate as a function of the training process using XArm6 on the simulation test dataset.
  • ...and 5 more figures
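
The first stage described in the Figure 2 caption (segment the functional part, propose two-finger grasps, map them to dexterous grasp poses, filter by collision) could be sketched as the control flow below. This is a minimal sketch under stated assumptions: every stage is passed in as a hypothetical callable, so no real segmentation or grasp-proposal API is implied, and returning the first collision-free candidate is an assumed selection rule rather than the paper's documented one.

```python
from typing import Callable, Iterable, Optional, TypeVar

Cloud = TypeVar("Cloud")      # partial-view point cloud, any representation
Grasp = TypeVar("Grasp")      # two-finger grasp pose
DexPose = TypeVar("DexPose")  # dexterous-hand grasp pose


def select_dexterous_grasp(
    cloud: Cloud,
    segment: Callable[[Cloud], Cloud],
    propose: Callable[[Cloud], Iterable[Grasp]],
    to_dexterous: Callable[[Grasp], Iterable[DexPose]],
    collides: Callable[[DexPose, Cloud], bool],
) -> Optional[DexPose]:
    """Return the first collision-free dexterous pose, or None if none exists.

    All four callables are hypothetical stand-ins for the stages named in
    the caption; `propose` is assumed to yield grasps in preference order.
    """
    functional_part = segment(cloud)
    for grasp in propose(functional_part):
        # One two-finger grasp may map to several dexterous grasp types.
        for pose in to_dexterous(grasp):
            if not collides(pose, cloud):
                return pose
    return None
```

Injecting the stages as callables keeps the selection logic testable with toy stand-ins before any perception model or simulator is connected.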