HabiCrowd: A High Performance Simulator for Crowd-Aware Visual Navigation
An Dinh Vuong, Toan Tien Nguyen, Minh Nhat Vu, Baoru Huang, Dzung Nguyen, Huynh Thi Thanh Binh, Thieu Vo, Anh Nguyen
TL;DR
HabiCrowd addresses a gap in embodied AI by introducing a high-performance benchmark for crowd-aware visual navigation. Built on Habitat 2.0, it integrates a continuous human dynamics model into photorealistic 3D HM3D scenes. The human dynamics model (UPL++) achieves state-of-the-art collision avoidance and is substantially more computationally efficient than existing 3D simulators that include human dynamics, which enables large-scale studies of human density and human-robot interaction. The paper provides a 480-scene HM3D-based dataset with 40 virtual humans, an evaluation of rendering speed and memory usage, and two crowd-aware navigation tasks that reveal how density and reward shaping affect performance. The result is a scalable, realistic platform for training and evaluating agents in dynamic, human-rich environments, aimed at supporting sim-to-real research.
Abstract
Visual navigation, a foundational aspect of Embodied AI (E-AI), has been studied extensively in the past few years. While many 3D simulators have been introduced to support visual navigation tasks, few works have incorporated human dynamics, leaving a gap between simulation and real-world applications. Furthermore, current 3D simulators that do incorporate human dynamics have several limitations, particularly in computational efficiency, a property that E-AI simulators are expected to deliver. To overcome these shortcomings, we introduce HabiCrowd, the first standard benchmark for crowd-aware visual navigation that integrates a crowd dynamics model with diverse human settings into photorealistic environments. Empirical evaluations demonstrate that our proposed human dynamics model achieves state-of-the-art performance in collision avoidance while being more computationally efficient than its counterparts. We use HabiCrowd to conduct several comprehensive studies on crowd-aware visual navigation tasks and human-robot interactions. The source code and data can be found at https://habicrowd.github.io/.
