Hierarchical Contact-Rich Trajectory Optimization for Multi-Modal Manipulation using Tight Convex Relaxations
Yuki Shirai, Arvind Raghunathan, Devesh K. Jha
TL;DR
This work tackles the challenge of designing dexterous, contact-rich manipulation trajectories by introducing a three-stage hierarchical optimization that jointly reasons about robot and object motion and contact sequences. It combines a MILP-based contact planning stage (C-Opt) with an NLP-based full dynamics stage (Q-Opt), guided by a kinematic stage (K-Opt), and augments bilinear constraint relaxations with a binary-encoded partitioning approach for tighter, scalable solutions. The framework is validated through numerical experiments and hardware demonstrations on a bimanual robot, showing the ability to realize complex multi-contact behaviors such as pivoting and sliding with improved computation and feasibility over traditional relaxations. Key contributions include the elimination of fixed contact modes, a tight convex relaxation via binary encoding, and comprehensive verification across tasks, highlighting practical potential for long-horizon, multi-modal manipulation.
Abstract
Designing trajectories for manipulation through contact is challenging because it requires reasoning about object and robot trajectories as well as complex contact sequences simultaneously. In this paper, we present a novel framework for efficiently and simultaneously designing trajectories of robots, objects, and contacts for contact-rich manipulation. We propose a hierarchical optimization framework in which a Mixed-Integer Linear Program (MILP) selects optimal contacts between robot and object using approximate dynamical constraints, and a Nonlinear Program (NLP) then optimizes the trajectories of the robot(s) and object under the full nonlinear constraints. We present a convex relaxation of bilinear constraints using a binary encoding technique so that the MILP provides tighter solutions with better computational complexity. The proposed framework is evaluated on various manipulation tasks, where it reasons about complex multi-contact interactions while providing computational advantages. We also demonstrate our framework in hardware experiments using a bimanual robot system. A video summarizing this paper and the hardware experiments is available at https://youtu.be/s2S1Eg5RsRE?si=chPkftz_a3NAHxLq
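
To illustrate the idea behind a partitioned relaxation of a bilinear constraint w = x*y, the following is a minimal sketch of the standard McCormick envelope and a uniformly partitioned variant in which the active interval is indexed by a binary code. It is not the formulation used in the paper: the function names, the uniform partition of x, and the numerical values are assumptions chosen for illustration, and the sketch evaluates the bounds at a known point rather than building a MILP.

```python
def mccormick_bounds(x, y, xl, xu, yl, yu):
    """Lower and upper McCormick envelope values for w = x*y at (x, y),
    given box bounds xl <= x <= xu and yl <= y <= yu."""
    lower = max(xl * y + x * yl - xl * yl,
                xu * y + x * yu - xu * yu)
    upper = min(xu * y + x * yl - xu * yl,
                xl * y + x * yu - xl * yu)
    return lower, upper


def partitioned_bounds(x, y, xl, xu, yl, yu, n_bits):
    """Envelope on the sub-interval of x that contains the query point.

    The x range is split into 2**n_bits equal intervals (an assumption of
    this sketch), so n_bits binary variables are enough to index them.
    Returns (lower, upper, bits) with bits listed least significant first.
    """
    n_parts = 2 ** n_bits
    width = (xu - xl) / n_parts
    # Index of the interval containing x, clamped so x == xu stays in range.
    idx = min(int((x - xl) / width), n_parts - 1)
    bits = [(idx >> b) & 1 for b in range(n_bits)]
    sub_xl = xl + idx * width
    sub_xu = sub_xl + width
    lower, upper = mccormick_bounds(x, y, sub_xl, sub_xu, yl, yu)
    return lower, upper, bits


if __name__ == "__main__":
    x, y = 0.6, 0.5  # hypothetical query point in the unit box
    lo, hi = mccormick_bounds(x, y, 0.0, 1.0, 0.0, 1.0)
    plo, phi, bits = partitioned_bounds(x, y, 0.0, 1.0, 0.0, 1.0, n_bits=2)
    print("true product:", x * y)
    print("single envelope gap:", hi - lo)
    print("partitioned envelope gap:", phi - plo, "interval bits:", bits)
```

In this sketch, four intervals are addressed with two binary digits instead of one binary variable per interval. That logarithmic growth in binary variables is the general motivation for binary-encoded partitioning, while the envelope on each sub-interval is tighter than the envelope over the whole range.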
