Task-Oriented Dexterous Hand Pose Synthesis Using Differentiable Grasp Wrench Boundary Estimator
Jiayi Chen, Yuxing Chen, Jialiang Zhang, He Wang
TL;DR
This work tackles task-oriented dexterous hand pose synthesis by bridging Task Wrench Space (TWS) and Grasp Wrench Space (GWS) through a differentiable energy, enabling gradient-based optimization to synthesize poses for force-closure, non-force-closure, and non-prehensile tasks without demonstrations. It introduces a fast GWS boundary estimator under the $L_\infty$ bound, a cosine-distance-based task energy, and a CUDA-accelerated synthesis pipeline implemented in cuRobo. Across 10 simulated tasks, it achieves a 72.6% success rate, with real-world validation on 4 tasks, and demonstrates up to ~50x faster force-closure grasp synthesis than DexGraspNet while maintaining comparable quality. The approach scales to large datasets (100k grasps over 5700 objects) in ~1.2 GPU hours, highlighting practical impact for rapid, task-aware manipulation with dexterous hands.
Abstract
This work tackles the problem of task-oriented dexterous hand pose synthesis, which involves generating a static hand pose capable of applying a task-specific set of wrenches to manipulate objects. Unlike previous approaches that focus solely on force-closure grasps, which are unsuitable for non-prehensile manipulation tasks (\textit{e.g.}, turning a knob or pressing a button), we introduce a unified framework covering force-closure grasps, non-force-closure grasps, and a variety of non-prehensile poses. Our key idea is a novel optimization objective quantifying the disparity between the Task Wrench Space (TWS, the desired wrenches predefined as a task prior) and the Grasp Wrench Space (GWS, the achievable wrenches computed from the current hand pose). By minimizing this objective, gradient-based optimization algorithms can synthesize task-oriented hand poses without additional human demonstrations. Our specific contributions include 1) a fast, accurate, and differentiable technique for estimating the GWS boundary; 2) a task-oriented objective function based on the disparity between the estimated GWS boundary and the provided TWS boundary; and 3) an efficient implementation of the synthesis pipeline that leverages CUDA acceleration and supports large-scale parallelization. Experimental results on 10 diverse tasks demonstrate a 72.6\% success rate in simulation. Furthermore, real-world validation for 4 tasks confirms the effectiveness of synthesized poses for manipulation. Notably, despite being primarily tailored for task-oriented hand pose synthesis, our pipeline can generate force-closure grasps 50 times faster than DexGraspNet while maintaining comparable grasp quality. Project page: https://pku-epic.github.io/TaskDexGrasp/.
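
To make the TWS/GWS disparity idea concrete, the following is a minimal toy sketch, not the paper's formulation. It assumes planar, frictionless point contacts whose force magnitudes each lie independently in [0, 1] (an $L_\infty$-style bound), so the GWS is a Minkowski sum of per-contact segments and its farthest point along a direction is the sum of the contact wrenches with positive projection. The function names (`contact_wrench`, `gws_support_point`, `task_energy`) and the exact energy form are hypothetical, chosen only for illustration; the hard selection used here is also not differentiable, unlike the estimator described above.

```python
import numpy as np

def contact_wrench(p, n):
    """Planar wrench (fx, fy, torque) of a unit force along n applied at point p."""
    p, n = np.asarray(p, float), np.asarray(n, float)
    n = n / np.linalg.norm(n)
    torque = p[0] * n[1] - p[1] * n[0]  # 2D cross product p x n
    return np.array([n[0], n[1], torque])

def gws_support_point(wrenches, u):
    """Farthest GWS point along u, assuming each contact force magnitude
    is bounded independently in [0, 1] (toy L-infinity-style assumption)."""
    wrenches = np.asarray(wrenches, float)
    active = (wrenches @ u) > 0.0        # contacts that help along u
    return wrenches[active].sum(axis=0)  # zero vector if none help

def task_energy(wrenches, tws_dirs, eps=1e-9):
    """Mean cosine distance between each desired wrench direction and the
    GWS boundary point found along it (hypothetical energy, for illustration)."""
    tws_dirs = np.asarray(tws_dirs, float)
    total = 0.0
    for t in tws_dirs:
        t = t / np.linalg.norm(t)
        s = gws_support_point(wrenches, t)
        cos = float(s @ t) / (np.linalg.norm(s) + eps)
        total += 1.0 - cos
    return total / len(tws_dirs)

# Toy "press a button" task: the desired wrench is a pure downward force.
press_down = [[0.0, -1.0, 0.0]]
fingertip_on_top = [contact_wrench((0.0, 1.0), (0.0, -1.0))]
fingertip_on_side = [contact_wrench((1.0, 0.0), (-1.0, 0.0))]

energy_top = task_energy(fingertip_on_top, press_down)
energy_side = task_energy(fingertip_on_side, press_down)
```

In this toy setting the fingertip placed on top of the button yields an energy near zero, while the fingertip on the side cannot push downward at all and yields the maximum energy of 1, which is the kind of signal a pose optimizer would descend on.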
