Task-Oriented Dexterous Hand Pose Synthesis Using Differentiable Grasp Wrench Boundary Estimator

Jiayi Chen, Yuxing Chen, Jialiang Zhang, He Wang

TL;DR

This work tackles task-oriented dexterous hand pose synthesis by bridging Task Wrench Space (TWS) and Grasp Wrench Space (GWS) through a differentiable energy, enabling gradient-based optimization to synthesize poses for force-closure, non-force-closure, and non-prehensile tasks without demonstrations. It introduces a fast GWS boundary estimator under the $L_\infty$ bound, a cosine-distance-based task energy, and a CUDA-accelerated synthesis pipeline implemented in cuRobo. Across 10 simulated tasks, it achieves a 72.6% success rate, with real-world validation on 4 tasks, and demonstrates up to ~50x faster force-closure grasp synthesis than DexGraspNet while maintaining comparable quality. The approach scales to large datasets (100k grasps over 5700 objects) in ~1.2 GPU hours, highlighting practical impact for rapid, task-aware manipulation with dexterous hands.

Abstract

This work tackles the problem of task-oriented dexterous hand pose synthesis, which involves generating a static hand pose capable of applying a task-specific set of wrenches to manipulate objects. Unlike previous approaches that focus solely on force-closure grasps, which are unsuitable for non-prehensile manipulation tasks (e.g., turning a knob or pressing a button), we introduce a unified framework covering force-closure grasps, non-force-closure grasps, and a variety of non-prehensile poses. Our key idea is a novel optimization objective quantifying the disparity between the Task Wrench Space (TWS, the desired wrenches predefined as a task prior) and the Grasp Wrench Space (GWS, the achievable wrenches computed from the current hand pose). By minimizing this objective, gradient-based optimization algorithms can synthesize task-oriented hand poses without additional human demonstrations. Our specific contributions include 1) a fast, accurate, and differentiable technique for estimating the GWS boundary; 2) a task-oriented objective function based on the disparity between the estimated GWS boundary and the provided TWS boundary; and 3) an efficient implementation of the synthesis pipeline that leverages CUDA acceleration and supports large-scale parallelization. Experimental results on 10 diverse tasks demonstrate a 72.6% success rate in simulation. Furthermore, real-world validation on 4 tasks confirms the effectiveness of the synthesized poses for manipulation. Notably, despite being primarily tailored for task-oriented hand pose synthesis, our pipeline can generate force-closure grasps 50 times faster than DexGraspNet while maintaining comparable grasp quality. Project page: https://pku-epic.github.io/TaskDexGrasp/.
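
One compact way to read the TWS/GWS disparity, borrowing the notation of the Figure 4 caption, is sketched below. The symbol $E_{\text{task}}$, the number of sampled directions $K$, and the unweighted sum are assumptions made for illustration; the paper's exact weighting and normalization may differ. Here $s_{\mathcal{A}}(\mathbf{u})$ is taken to be the point of a set $\mathcal{A}$ that lies furthest along direction $\mathbf{u}$.

$$
s_{\mathcal{A}}(\mathbf{u}) = \arg\max_{\mathbf{a} \in \mathcal{A}} \mathbf{u}^{\top}\mathbf{a},
\qquad
E_{\text{task}} = \sum_{k=1}^{K} \left( 1 - \frac{s_{\mathcal{W}_t}(\mathbf{u}_k)^{\top}\, s_{\mathcal{W}_g}(\mathbf{u}_k)}{\lVert s_{\mathcal{W}_t}(\mathbf{u}_k) \rVert \, \lVert s_{\mathcal{W}_g}(\mathbf{u}_k) \rVert} \right)
$$

Under this reading, each term vanishes when the GWS boundary point along $\mathbf{u}_k$ points in the same direction as the corresponding TWS boundary point, so a low energy indicates that the pose can produce wrenches aligned with those the task requests.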

Paper Structure (16 sections, 11 equations, 7 figures, 5 tables)

This paper contains 16 sections, 11 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Task-oriented dexterous hand poses synthesized for 3 tasks. Each TWS is specified by humans as a task prior: (1) the lift task requires upward forces, (2) the screw task requires counter-clockwise torques, and (3) the force-closure task requires wrenches in all directions. The GWS is computed based on the contacts between the hand and the object. Both TWS and GWS are visualized by their 3D force or torque components.
  • Figure 2: Method illustration. (1) The support mapping $s_{\mathcal{A}}(\mathbf{u})$ and the associated proposition (label prop_add). (2) $s_{\mathcal{F}}(\mathbf{u})$ for the PCF contact model in 3D.
  • Figure 3: GWS estimation visualized in 3D force space. (1) In this example, the GWS is the Minkowski sum of two cones. (2, 3, 4) Points (white) are sampled on the GWB (green) by different methods. (4) Our method with approximation yields dense samples.
  • Figure 4: Task-oriented energy. (1) TWS is formulated as a 6D hyper-spherical sector parametrized by a 6D unit vector $\mathbf{w}_t$ and an angle $\gamma$. (2) The task-oriented energy is the sum, over the directions $\mathbf{u}_k$, of the cosine distances between $s_{\mathcal{W}_t}(\mathbf{u}_k)$ and $s_{\mathcal{W}_g}(\mathbf{u}_k)$.
  • Figure 5: Visualization of synthesized task-oriented poses. We show similar poses for the Shadow hand and the LEAP hand, but different poses for the Allegro hand.
  • ...and 2 more figures
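
The ideas named in the captions of Figures 2 to 4 can be illustrated with a small, self-contained sketch. It is not the paper's implementation: it works in 3D force space only (no torques), replaces each friction cone with a polyhedral approximation capped at unit normal force, and fixes the task direction to +z. All function names, the friction coefficient `mu`, and the choice to treat a zero wrench as maximally dissimilar are assumptions made for illustration. The sketch relies on one standard fact: the support point of a Minkowski sum is the sum of the support points of its summands.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def support(points, u):
    # Support mapping of conv(points): the member furthest along direction u.
    return max(points, key=lambda p: dot(p, u))

def cone_points(normal, t1, t2, mu=0.5, m=16):
    # Polyhedral stand-in for a friction cone with normal force capped at 1:
    # the apex plus m rim points, expressed in the frame (normal, t1, t2).
    pts = [(0.0, 0.0, 0.0)]
    for i in range(m):
        th = 2.0 * math.pi * i / m
        pts.append(tuple(n + mu * (math.cos(th) * a + math.sin(th) * b)
                         for n, a, b in zip(normal, t1, t2)))
    return pts

def gws_support(cones, u):
    # Additivity: support point of a Minkowski sum = sum of support points.
    total = (0.0, 0.0, 0.0)
    for cone in cones:
        total = add(total, support(cone, u))
    return total

def cosine_distance(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if na < 1e-12 or nb < 1e-12:
        return 1.0  # assumption: a zero wrench counts as maximally dissimilar
    return 1.0 - dot(a, b) / (na * nb)

def sector_directions(gamma, k=8):
    # Unit vectors around the task axis +z: the axis plus k directions at angle gamma.
    dirs = [(0.0, 0.0, 1.0)]
    for i in range(k):
        ph = 2.0 * math.pi * i / k
        dirs.append((math.sin(gamma) * math.cos(ph),
                     math.sin(gamma) * math.sin(ph),
                     math.cos(gamma)))
    return dirs

def task_energy(cones, directions):
    # For a unit-radius sector, the TWS support point along a unit direction
    # inside the sector is that direction itself, so u stands in for it here.
    return sum(cosine_distance(u, gws_support(cones, u)) for u in directions)

X, Y, Z = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
pinch = [cone_points(X, Y, Z), cone_points((-1.0, 0.0, 0.0), Y, Z)]  # two opposing contacts
press = [cone_points((0.0, 0.0, -1.0), X, Y)]                        # one contact pushing down
dirs = sector_directions(math.radians(30))
e_pinch = task_energy(pinch, dirs)
e_press = task_energy(press, dirs)
print(e_pinch, e_press)
```

In this toy setup the two-contact pinch can produce upward force through friction, so its energy for the upward task is lower than that of the single downward-pressing contact, which cannot push the object up at all. A gradient-based optimizer would need a differentiable version of `support`; the hard `max` used here is only meant to show the geometry.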

Theorems & Definitions (1)

  • proof