Safe mobility support system using crowd mapping and avoidance route planning using VLM
Sena Saito, Kenta Tabata, Renato Miyagusuku, Koichi Ozaki
TL;DR
This work addresses safe navigation of autonomous robots in crowded, dynamic environments by introducing an Abstraction Map Generator (AMG) that uses Vision-Language Models to detect abstract concepts like crowds and Gaussian Process Regression to convert these detections into a probabilistic crowd-density map. The crowd map is fused with a geometric map to produce a multi-layer cost surface, enabling Dijkstra-based path planning that avoids both static obstacles and crowds. Real-world campus experiments validate that AMG can generate usable cost maps and yield paths that circumvent crowds, while highlighting challenges in detection stability and the need for appropriate weight balancing. The study demonstrates a practical framework that bridges abstract scene understanding with geometric planning, with potential extensions to multi-layer environmental factors for enhanced robustness in real-time robotic navigation.
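The fusion-then-plan step summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the additive fusion rule, the `crowd_weight` parameter, the 4-connected grid, and the function names `fuse_layers` and `dijkstra` are all assumptions chosen for illustration. The sketch adds a weighted crowd-density layer to a static occupancy grid and runs Dijkstra's algorithm on the resulting cost grid.

```python
import heapq

def fuse_layers(occupancy, crowd_density, crowd_weight=5.0):
    """Fuse a static occupancy grid (1 = obstacle) with a crowd-density layer
    (values in [0, 1]) into one cost grid. None marks impassable cells.
    The additive rule and crowd_weight are illustrative assumptions."""
    fused = []
    for occ_row, crowd_row in zip(occupancy, crowd_density):
        fused.append([
            None if occ else 1.0 + crowd_weight * density
            for occ, density in zip(occ_row, crowd_row)
        ])
    return fused

def dijkstra(cost, start, goal):
    """Return (total_cost, path) on a 4-connected grid, or (inf, []) if the
    goal is unreachable. The cost of a move is the cost of the cell entered."""
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        dist, cell = heapq.heappop(queue)
        if cell == goal:
            # Walk the parent links back to the start to recover the path.
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return dist, path[::-1]
        if dist > best.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] is not None:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    parent[(nr, nc)] = cell
                    heapq.heappush(queue, (new_dist, (nr, nc)))
    return float("inf"), []

# Toy 3x5 map: no static obstacles, a dense crowd in the middle corridor.
occupancy = [[0] * 5 for _ in range(3)]
crowd = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.9, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
start, goal = (1, 0), (1, 4)

avoid_cost, avoid_path = dijkstra(fuse_layers(occupancy, crowd, 5.0), start, goal)
direct_cost, direct_path = dijkstra(fuse_layers(occupancy, crowd, 0.0), start, goal)
print(avoid_path)
print(direct_path)
```

With the crowd layer weighted at 5.0 the planner detours around the crowded cells; with the weight set to 0.0 it takes the straight corridor. This is a toy view of why the balance between layers matters: too small a weight ignores the crowd, while too large a weight can force long detours.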
Abstract
Autonomous mobile robots offer promising solutions to labor shortages and can increase operational efficiency. However, navigating safely and effectively in dynamic environments, particularly crowded areas, remains challenging. This paper proposes a novel framework that integrates Vision-Language Models (VLMs) and Gaussian Process Regression (GPR) to generate dynamic crowd-density maps ("Abstraction Maps") for autonomous robot navigation. Our approach uses the VLM's ability to recognize abstract environmental concepts, such as crowd density, and represents them probabilistically via GPR. In real-world trials on a university campus, robots generated routes that avoided both static obstacles and dynamic crowds, enhancing navigation safety and adaptability.
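To make the GPR step concrete, the following is a minimal sketch of how sparse crowd detections could be turned into a smooth density estimate. It rests on assumptions not stated in the abstract: detections are taken to be 2D positions labeled 1.0 (crowd) or 0.0 (no crowd), the kernel is a squared-exponential with a hand-picked length scale, the prior mean is zero (read here as "no crowd unless observed"), and only the posterior mean is computed. The names `rbf`, `solve`, and `gp_posterior_mean` are hypothetical.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 2D points."""
    d2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return math.exp(-d2 / (2.0 * length_scale ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior_mean(train_xy, train_y, query_xy, length_scale=1.0, noise=0.1):
    """Zero-mean GP posterior mean: k_*^T (K + noise^2 I)^{-1} y."""
    n = len(train_xy)
    K = [
        [rbf(train_xy[i], train_xy[j], length_scale) + (noise ** 2 if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    alpha = solve(K, train_y)
    return [
        sum(rbf(q, x, length_scale) * a for x, a in zip(train_xy, alpha))
        for q in query_xy
    ]

# Hypothetical detections: one "crowd" observation and two "no crowd" observations.
detections_xy = [(2.0, 2.0), (0.0, 0.0), (4.0, 4.0)]
detections_y = [1.0, 0.0, 0.0]

near, mid, far = gp_posterior_mean(
    detections_xy, detections_y, [(2.0, 2.0), (3.0, 2.0), (10.0, 10.0)]
)
print(near, mid, far)
```

The estimated density is highest at the detected crowd, decays with distance from it, and falls back toward the zero prior far from any observation. A full treatment would also use the posterior variance to express how uncertain the map is in unobserved regions.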
