Sampling-Based Optimization with Parallelized Physics Simulator for Bimanual Manipulation

Iryna Hurova, Alinjar Dan, Karl Kruusamäe, Arun Kumar Singh

TL;DR

Addressing the brittleness of purely learning-based policies in cluttered, contact-rich bimanual manipulation, the paper advocates a sampling-based planning paradigm using a GPU-accelerated MuJoCo world model. It introduces a customized MPPI controller with an embedded QP for smooth, jerk-bounded trajectories and evaluates thousands of parallel rollouts guided by task-specific costs. The work demonstrates new, harder variants of PerAct^2 tasks—such as moving a ball through obstacles—and reports real-time performance on commodity GPUs with promising sim-to-real transfer. A statistical analysis characterizes sample complexity and robustness, underscoring the practical viability of model-based planning for complex bimanual manipulation.

Abstract

In recent years, dual-arm manipulation has become an area of strong interest in robotics, with end-to-end learning emerging as the predominant strategy for solving bimanual tasks. A critical limitation of such learning-based approaches, however, is their difficulty in generalizing to novel scenarios, especially within cluttered environments. This paper presents an alternative paradigm: a sampling-based optimization framework that utilizes a GPU-accelerated physics simulator as its world model. We demonstrate that this approach can solve complex bimanual manipulation tasks in the presence of static obstacles. Our contribution is a customized Model Predictive Path Integral Control (MPPI) algorithm, guided by carefully designed task-specific cost functions, that uses GPU-accelerated MuJoCo for efficiently evaluating robot-object interaction. We apply this method to solve significantly more challenging versions of tasks from the PerAct^2 benchmark, such as requiring the point-to-point transfer of a ball through an obstacle course. Furthermore, we establish that our method achieves real-time performance on commodity GPUs and facilitates successful sim-to-real transfer by leveraging unique features within MuJoCo. The paper concludes with a statistical analysis of the sample complexity and robustness, quantifying the performance of our approach. The project website is available at: https://sites.google.com/view/bimanualakslabunitartu
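
To make the sampling-based idea in the abstract concrete, the following is a minimal sketch of a generic MPPI update, not the paper's customized algorithm. It assumes a toy 2D point mass under velocity control in place of GPU-accelerated MuJoCo, a simple squared-distance cost in place of the task-specific costs, and no QP smoothing step. All names (`rollout_cost`, `mppi_step`, `run_mppi`, `DT`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

DT = 0.05  # assumed integration step of the toy model (not from the paper)


def rollout_cost(controls, x0, goal):
    """Toy stand-in for the parallel physics simulator.

    controls: (K, H, 2) batch of velocity sequences for a 2D point mass.
    Returns one scalar cost per sampled sequence, shape (K,).
    """
    positions = x0 + DT * np.cumsum(controls, axis=1)        # (K, H, 2)
    sq_dist = np.sum((positions - goal) ** 2, axis=-1)        # (K, H)
    effort = np.sum(controls ** 2, axis=-1)                   # (K, H)
    return sq_dist.sum(axis=1) + 1e-3 * effort.sum(axis=1)


def mppi_step(mean, x0, goal, rng, num_samples=512, sigma=0.5, temperature=1.0):
    """One MPPI update: sample, evaluate all rollouts in a batch, re-weight."""
    horizon, dim = mean.shape
    noise = rng.normal(0.0, sigma, size=(num_samples, horizon, dim))
    candidates = mean[None] + noise                           # (K, H, dim)
    costs = rollout_cost(candidates, x0, goal)
    # Softmin weights; subtracting the minimum keeps the exponent stable.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    # Cost-weighted average of the sampled control sequences.
    return np.tensordot(weights, candidates, axes=1)          # (H, dim)


def run_mppi(x0, goal, steps=60, horizon=15, seed=0):
    """Receding-horizon loop: plan, apply the first control, shift the plan."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    goal = np.asarray(goal, dtype=float)
    mean = np.zeros((horizon, 2))
    for _ in range(steps):
        mean = mppi_step(mean, x, goal, rng)
        x = x + DT * mean[0]                     # execute the first control only
        mean = np.vstack([mean[1:], mean[-1:]])  # warm start for the next step
    return x
```

Calling `run_mppi([0.0, 0.0], [1.0, 1.0])` drives the point mass toward the goal. In the setting the abstract describes, the batched call to `rollout_cost` would correspond to the thousands of parallel simulator rollouts, which is why the batch size matters for both success rate and computation time.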

Paper Structure

This paper contains 16 sections, 29 equations, 5 figures, 1 algorithm.

Figures (5)

  • Figure 1: Comparison of our approach (top) with [gkanatsios20253d] (bottom). We consider the task of lifting the ball made complicated by the presence of an obstacle. In free space, both our approach and [gkanatsios20253d] work flawlessly. However, [gkanatsios20253d] is not able to adapt the lifting strategy in the presence of obstacles.
  • Figure 2: Manipulator arms in the global frame.
  • Figure 3: Experimental setup, in which the real world and the simulation are tightly coupled through OptiTrack motion capture.
  • Figure 4: Snapshot of all tasks. Top row: Manipulators lifting a ball while avoiding obstacles. Middle row: Manipulators picking a tray and placing it between two wall-like obstacles. Last row: Manipulators performing hand-over tasks amidst static obstacles. Note that the viewing angle for each snapshot is different. Refer to the accompanying videos for further details.
  • Figure 5: Statistics of all tasks. (a) Success rate, (b) task time, and (c) computation time with respect to different batch sizes used in Algorithm 1.