M3Bench: Benchmarking Whole-body Motion Generation for Mobile Manipulation in 3D Scenes
Zeyu Zhang, Sixu Yan, Muzhi Han, Zaijin Wang, Xinggang Wang, Song-Chun Zhu, Hangxin Liu
TL;DR
This work introduces M3Bench, a large-scale benchmark for whole-body motion generation in mobile manipulation within 3D scenes, and M3BenchMaker, an automatic data-generation tool that constructs coordinated base–arm trajectories from high-level task instructions. The framework combines four modules (Task Builder, Conditional Scene Sampler, Goal Configuration Generator, and VKC Problem Generator) to produce feasible demonstrations, which are validated in Isaac Sim and span 119 scenes and 32 object types. Experiments comparing planning-based and learning-based methods show that robust base–arm coordination under environmental constraints remains difficult, and that hybrid approaches offer only limited gains. Together, the benchmark and the tool provide a scalable platform for advancing embodied AI toward more adaptive mobile manipulation in realistic environments.
Abstract
We propose M3Bench, a new benchmark for whole-body motion generation in mobile manipulation tasks. Given a 3D scene context, M3Bench requires an embodied agent to reason about its configuration, environmental constraints, and task objectives to generate coordinated whole-body motion trajectories for object rearrangement. M3Bench features 30,000 object rearrangement tasks across 119 diverse scenes, with expert demonstrations generated by our newly developed M3BenchMaker, an automatic data generation tool that produces whole-body motion trajectories from high-level task instructions using only basic scene and robot information. The benchmark includes several task splits for evaluating generalization along different dimensions and uses realistic physics simulation to assess trajectories. Extensive evaluation reveals that state-of-the-art models struggle to coordinate base–arm motion while adhering to environmental and task-specific constraints, underscoring the need for new models to bridge this gap. By releasing M3Bench and M3BenchMaker, we aim to advance robotics research toward more adaptive and capable mobile manipulation in diverse, real-world environments.
