Learning Diverse and Physically Feasible Dexterous Grasps with Generative Model and Bilevel Optimization
Albert Wu, Michelle Guo, C. Karen Liu
TL;DR
The paper addresses the problem of generating diverse, physically feasible dexterous grasps for novel objects. It proposes a hybrid pipeline in which a conditional variational autoencoder (CVAE), trained on only six YCB objects, first predicts finger placements from the object point cloud; a bilevel optimization (BO) then refines the prediction under wrench closure, friction cone, reachability, and collision constraints. The method is validated on real hardware with an Allegro hand mounted on a Panda arm, achieving an 86.7% success rate over 120 grasping trials on 20 household objects. Quantitative evaluations confirm that the resulting grasp configurations satisfy the kinematic and dynamic constraints. The results indicate that seeding a physics-informed optimization with a learned prediction yields grasps that are both diverse and physically feasible.
Abstract
To fully utilize the versatility of a multi-fingered dexterous robotic hand for executing diverse object grasps, one must consider the rich physical constraints introduced by hand-object interaction and object geometry. We propose an integrated approach that combines a generative model with a bilevel optimization (BO) to plan diverse grasp configurations on novel objects. First, a conditional variational autoencoder trained on merely six YCB objects predicts the finger placement directly from the object point cloud. The prediction is then used to seed a nonconvex BO that solves for a grasp configuration under collision, reachability, wrench closure, and friction constraints. Our method achieved an 86.7% success rate over 120 real-world grasping trials on 20 household objects, including unseen and challenging geometries. Through quantitative empirical evaluations, we confirm that grasp configurations produced by our pipeline are indeed guaranteed to satisfy kinematic and dynamic constraints. A video summary of our results is available at youtu.be/9DTrImbN99I.
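To make the friction and wrench constraints named in the abstract concrete, the following is a minimal sketch of how they might be checked for a set of point contacts. It is not the authors' implementation: it assumes a Coulomb point-contact model with friction coefficient `mu`, and the function names `in_friction_cone`, `contact_wrench`, and `net_wrench` are hypothetical. It only tests whether given contact forces lie in their friction cones and balance each other, which is a necessary ingredient of, but weaker than, the wrench closure condition the BO enforces.

```python
import numpy as np


def in_friction_cone(force, normal, mu):
    """Return True if `force` lies in the Coulomb friction cone.

    `normal` is the contact normal pointing into the object (assumption),
    and `mu` is the friction coefficient.
    """
    normal = normal / np.linalg.norm(normal)
    f_n = float(np.dot(force, normal))   # normal component of the force
    f_t = force - f_n * normal           # tangential component
    return bool(f_n >= 0.0 and np.linalg.norm(f_t) <= mu * f_n + 1e-12)


def contact_wrench(position, force):
    """Stack force and torque (position x force) into a 6D wrench."""
    return np.concatenate([force, np.cross(position, force)])


def net_wrench(positions, forces):
    """Sum the wrenches of all contacts about the object frame origin."""
    return np.sum([contact_wrench(p, f) for p, f in zip(positions, forces)], axis=0)


# Two antipodal contacts on opposite sides of an object centered at the origin.
positions = [np.array([-1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])]
normals = [np.array([1.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0])]
forces = [np.array([1.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0])]

feasible = all(in_friction_cone(f, n, mu=0.5) for f, n in zip(forces, normals))
balanced = np.allclose(net_wrench(positions, forces), 0.0)
```

In this toy example both squeezing forces act along the contact normals, so they lie inside their friction cones and the net wrench on the object is zero. A full wrench closure test would additionally require that the contacts can resist arbitrary external wrenches, not just that one particular set of forces balances.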
