MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility
Wayne Wu, Honglin He, Jack He, Yiran Wang, Chenda Duan, Zhizheng Liu, Quanyi Li, Bolei Zhou
TL;DR
MetaUrban addresses the need for scalable, safe AI in urban micromobility by introducing a compositional simulator that can generate an infinite number of urban scenes with rich semantics, diverse agents, and realistic dynamics. Its three core modules (Hierarchical Layout Generation, Scalable Obstacle Retrieval, and Cohabitant Populating), together with the MetaUrban-12K dataset, provide a rigorous platform for evaluating PointNav and SocialNav under varied geometries, terrains, and pedestrian interactions. The results show strong generalization to unseen environments and highlight the trade-off between safety and performance, while cross-machine analyses reveal how mechanical design shapes policy learning. By open-sourcing the platform and dataset, MetaUrban aims to accelerate research on safe, trustworthy embodied AI for urban micromobility and to inform future urban planning and robotics deployment.
Abstract
Public urban spaces like streetscapes and plazas serve residents and accommodate social life in all its vibrant variations. With recent advances in robotics and embodied AI, public urban spaces are no longer exclusive to humans. Food delivery bots and electric wheelchairs have started sharing sidewalks with pedestrians, while robot dogs and humanoids have recently emerged on the street. Micromobility enabled by AI for short-distance travel in public urban spaces is a crucial component of the future transportation system. Ensuring the generalizability and safety of the AI models that maneuver mobile machines is essential. In this work, we present MetaUrban, a compositional simulation platform for AI-driven urban micromobility research. MetaUrban can construct an infinite number of interactive urban scenes from compositional elements, covering a vast array of ground plans, object placements, pedestrians, vulnerable road users, and other mobile agents' appearances and dynamics. We design point navigation and social navigation tasks as a pilot study using MetaUrban for urban micromobility research and establish various baselines of Reinforcement Learning and Imitation Learning. We conduct extensive evaluation across mobile machines, demonstrating that heterogeneous mechanical structures significantly influence the learning and execution of AI policies. We perform a thorough ablation study, showing that the compositional nature of the simulated environments can substantially improve the generalizability and safety of the trained mobile agents. MetaUrban, including its code and dataset, will be made publicly available to provide research opportunities and foster safe and trustworthy embodied AI and micromobility in cities.
