Sampling-Based Optimization with Parallelized Physics Simulator for Bimanual Manipulation
Iryna Hurova, Alinjar Dan, Karl Kruusamäe, Arun Kumar Singh
TL;DR
To address the brittleness of purely learning-based policies in cluttered, contact-rich bimanual manipulation, the paper advocates a sampling-based planning paradigm that uses GPU-accelerated MuJoCo as its world model. It introduces a customized Model Predictive Path Integral (MPPI) controller with an embedded quadratic program (QP) that yields smooth, jerk-bounded trajectories, and it evaluates thousands of parallel rollouts guided by task-specific costs. The work demonstrates new, harder variants of PerAct^2 tasks, such as moving a ball through obstacles, and reports real-time performance on commodity GPUs together with promising sim-to-real transfer. A statistical analysis characterizes sample complexity and robustness, underscoring the practical viability of model-based planning for complex bimanual manipulation.
Abstract
In recent years, dual-arm manipulation has become an area of strong interest in robotics, with end-to-end learning emerging as the predominant strategy for solving bimanual tasks. A critical limitation of such learning-based approaches, however, is their difficulty in generalizing to novel scenarios, especially within cluttered environments. This paper presents an alternative paradigm: a sampling-based optimization framework that utilizes a GPU-accelerated physics simulator as its world model. We demonstrate that this approach can solve complex bimanual manipulation tasks in the presence of static obstacles. Our contribution is a customized Model Predictive Path Integral Control (MPPI) algorithm, guided by carefully designed task-specific cost functions, that uses GPU-accelerated MuJoCo for efficiently evaluating robot-object interaction. We apply this method to solve significantly more challenging versions of tasks from the PerAct$^{2}$ benchmark, such as requiring the point-to-point transfer of a ball through an obstacle course. Furthermore, we establish that our method achieves real-time performance on commodity GPUs and facilitates successful sim-to-real transfer by leveraging unique features within MuJoCo. The paper concludes with a statistical analysis of the sample complexity and robustness, quantifying the performance of our approach. The project website is available at: https://sites.google.com/view/bimanualakslabunitartu .
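For readers unfamiliar with MPPI, the following is a minimal sketch of the generic cost-weighted update at the core of the textbook algorithm, not the customized controller described in this paper. It omits the embedded QP and the GPU-parallel MuJoCo rollouts entirely. The function names `mppi_weights` and `mppi_update`, the `temperature` parameter, and the scalar-control layout are illustrative assumptions; the rollout costs are supplied by hand, whereas in a real planner they would come from simulating each perturbed control sequence.

```python
import math


def mppi_weights(costs, temperature=1.0):
    """Softmax weights over rollout costs; lower cost gets higher weight."""
    best = min(costs)  # subtract the minimum cost for numerical stability
    unnorm = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def mppi_update(nominal, perturbations, costs, temperature=1.0):
    """Shift the nominal controls by the cost-weighted mean perturbation.

    nominal:       list of T scalar controls (assumed layout)
    perturbations: K lists of T noise values, one list per rollout
    costs:         K rollout costs (hand-supplied here; assumed to come
                   from a simulator in practice)
    """
    weights = mppi_weights(costs, temperature)
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, perturbations))
        for t, u in enumerate(nominal)
    ]


# Toy usage: two rollouts, the first one is much cheaper than the second.
nominal = [0.0, 0.0]
perturbations = [[1.0, 1.0], [-1.0, -1.0]]
costs = [0.0, 10.0]
weights = mppi_weights(costs)
updated = mppi_update(nominal, perturbations, costs)
```

In this toy case nearly all the weight falls on the low-cost rollout, so the updated controls move almost fully toward its perturbation. A lower `temperature` makes the update greedier, while a higher one averages more evenly across samples.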
