
Hierarchical Contact-Rich Trajectory Optimization for Multi-Modal Manipulation using Tight Convex Relaxations

Yuki Shirai, Arvind Raghunathan, Devesh K. Jha

TL;DR

This work tackles the challenge of designing dexterous, contact-rich manipulation trajectories by introducing a three-stage hierarchical optimization that jointly reasons about robot and object motion and contact sequences. It combines a MILP-based contact planning stage (C-Opt) with an NLP-based full dynamics stage (Q-Opt), guided by a kinematic stage (K-Opt), and augments bilinear constraint relaxations with a binary-encoded partitioning approach for tighter, scalable solutions. The framework is validated through numerical experiments and hardware demonstrations on a bimanual robot, showing the ability to realize complex multi-contact behaviors such as pivoting and sliding with improved computation and feasibility over traditional relaxations. Key contributions include the elimination of fixed contact modes, a tight convex relaxation via binary encoding, and comprehensive verification across tasks, highlighting practical potential for long-horizon, multi-modal manipulation.

Abstract

Designing trajectories for manipulation through contact is challenging because it requires reasoning about object and robot trajectories as well as complex contact sequences simultaneously. In this paper, we present a novel framework for simultaneously and efficiently designing trajectories of robots, objects, and contacts for contact-rich manipulation. We propose a hierarchical optimization framework in which a Mixed-Integer Linear Program (MILP) selects optimal contacts between the robot and the object using approximate dynamical constraints, and a NonLinear Program (NLP) then optimizes the trajectories of the robot(s) and object under the full nonlinear constraints. We present a convex relaxation of bilinear constraints using a binary encoding technique so that the MILP provides tighter solutions with better computational complexity. The proposed framework is evaluated on various manipulation tasks, where it reasons about complex multi-contact interactions while providing computational advantages. We also demonstrate our framework in hardware experiments using a bimanual robot system. A video summarizing this paper and the hardware experiments is available at https://youtu.be/s2S1Eg5RsRE?si=chPkftz_a3NAHxLq
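
To make the idea of tightening a bilinear relaxation concrete, the following is a minimal sketch and not the paper's formulation. It assumes the standard McCormick envelope for a bilinear term $w = xy$ over a box, and a uniform partition of the $x$-range into `n_parts` intervals. The function names (`mccormick_bounds`, `partitioned_bounds`, `binaries_needed`) and the choice of partitioning only $x$ are illustrative assumptions. The binary count assumes a generic logarithmic encoding of the active partition index; whether the paper uses exactly this scheme is not stated in the abstract.

```python
def mccormick_bounds(x, y, xl, xu, yl, yu):
    """Lower/upper bounds on w = x*y implied by the McCormick envelope
    over the box [xl, xu] x [yl, yu], evaluated at the point (x, y)."""
    lower = max(xl * y + x * yl - xl * yl,
                xu * y + x * yu - xu * yu)
    upper = min(xu * y + x * yl - xu * yl,
                xl * y + x * yu - xl * yu)
    return lower, upper


def partitioned_bounds(x, y, xl, xu, yl, yu, n_parts):
    """McCormick bounds restricted to the x-partition that contains x.
    Assumption: uniform partition of [xl, xu]; in a MILP the active
    partition would be selected by binary variables."""
    width = (xu - xl) / n_parts
    k = min(int((x - xl) / width), n_parts - 1)  # index of active partition
    return mccormick_bounds(x, y, xl + k * width, xl + (k + 1) * width, yl, yu)


def binaries_needed(n_parts):
    """Binaries required to index n_parts partitions with a logarithmic
    (binary) encoding, i.e. ceil(log2(n_parts)). Illustrative assumption."""
    return (n_parts - 1).bit_length()


if __name__ == "__main__":
    x, y = 0.6, 0.5
    lo, hi = mccormick_bounds(x, y, 0.0, 1.0, 0.0, 1.0)
    plo, phi = partitioned_bounds(x, y, 0.0, 1.0, 0.0, 1.0, n_parts=4)
    print("true product:", x * y)
    print("single-box gap:", hi - lo)
    print("4-partition gap:", phi - plo)
    print("binaries for 4 partitions:", binaries_needed(4))
```

In this example the relaxation gap at $(x, y) = (0.6, 0.5)$ shrinks from about 0.4 over the single box to about 0.1 with four partitions, while a logarithmic encoding needs only two binaries to select among the four partitions rather than one binary per partition.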

Paper Structure

This paper contains 23 sections, 10 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: A bimanual system that, using the proposed algorithm, reasons about pivoting a box in order to grasp it so that it can be stowed on a shelf. The AprilTag system is used for feedback during grasping since the box might slip during pivoting. The hardware experiment video is available at https://youtu.be/s2S1Eg5RsRE?si=g7JP4_0Cchm49c2b.
  • Figure 2: A schematic showing the free-body diagram of a rigid body during manipulation where two robots make contact and there is one extrinsic contact between the object and the environment. We consider $N_r = 2$, $N_v = 6$, and $N_p = 6$ in this figure. The red line represents one specific object's contact surface with the corresponding local force $\lambda$. The variables in this figure are summarized in Table \ref{tab:my_label}.
  • Figure 3: Overview of the proposed framework. See Sec. \ref{overview_sec} for details. Orange points and green points represent the robot contact locations and extrinsic contact locations, respectively. Red lines represent the object's contact surfaces where the robot makes contact. A cutting-plane method is used to handle infeasible solutions from Q-Opt (see Sec. \ref{sec:feedback_cutting}).
  • Figure 4: Trajectories of pivoting manipulation from Q-Opt. We do not let one of the arms make contact if $q_y \geq -0.05\,\mathrm{m}$, denoted as blue arrows. We consider $T=150$ but, for clarity, plot only 15 snapshots and omit the gravity arrows (which are aligned along the $y$-axis). The bold green line represents the goal state. The orange line shows $R$ at $t = 0$ (see Sec. \ref{sec:main_result}).
  • Figure 5: Trajectories for sliding manipulation to rotate the object by 45°. Because Q-Opt considers the full nonlinear dynamics, it achieves better motion than C-Opt. Note that we consider $T=50$ but plot only 10 snapshots for clarity. The bold green line represents the goal state.
  • ...and 2 more figures