HiCrowd: Hierarchical Crowd Flow Alignment for Dense Human Environments

Yufei Zhu, Shih-Min Yang, Martin Magnusson, Allan Wang

TL;DR

The paper tackles dense-crowd navigation by addressing the freezing robot problem with HiCrowd, a hierarchical RL–MPC framework that uses nearby crowd flow as guidance. The high-level RL policy outputs a follow point to align the robot with a suitable pedestrian group, while the low-level MPC ensures safe, short-horizon trajectory tracking. Key contributions include a crowd-following reward that accelerates learning, integration of online crowd flow into planning, and successful real-world deployment demonstrating socially aware behavior. The results show HiCrowd outperforms reactive and learning-based baselines in both offline and online settings, reducing freezing and improving navigation efficiency, with practical implications for safe autonomous operation in crowded environments.

Abstract

Navigating through dense human crowds remains a significant challenge for mobile robots. A key issue is the freezing robot problem, where the robot struggles to find safe motions and becomes stuck within the crowd. To address this, we propose HiCrowd, a hierarchical framework that integrates reinforcement learning (RL) with model predictive control (MPC). HiCrowd leverages surrounding pedestrian motion as guidance, enabling the robot to align with compatible crowd flows. A high-level RL policy generates a follow point to align the robot with a suitable pedestrian group, while a low-level MPC safely tracks this guidance with short-horizon planning. The method combines long-term, crowd-aware decision making with safe short-term execution. We evaluate HiCrowd against reactive and learning-based baselines in an offline setting (replaying recorded human trajectories) and an online setting (human trajectories are updated in simulation to react to the robot). Experiments on a real-world dataset and a synthetic crowd dataset show that our method outperforms the baselines in navigation efficiency and safety while reducing freezing behaviors. Our results suggest that leveraging human motion as guidance, rather than treating humans solely as dynamic obstacles, provides a powerful principle for safe and efficient robot navigation in crowds.
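The two-level split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the learned RL policy is replaced here by a simple heuristic (the centroid of observed pedestrians whose velocity points roughly toward the goal), and the MPC is replaced by a sampling-based short-horizon tracker that assumes constant-velocity pedestrians. All function names, parameters, and default values (`select_follow_point`, `track_follow_point`, `r_obs`, `min_cos`, `clearance`, and so on) are assumptions made for illustration.

```python
import math


def select_follow_point(robot_xy, goal_xy, humans, r_obs=5.0, min_cos=0.5):
    """Heuristic stand-in for the high-level RL policy (an assumption).

    humans is a list of (x, y, vx, vy). Returns the centroid of observed
    pedestrians moving roughly toward the goal, or the goal if none qualify.
    """
    gx, gy = goal_xy[0] - robot_xy[0], goal_xy[1] - robot_xy[1]
    g_norm = math.hypot(gx, gy)
    compatible = []
    for x, y, vx, vy in humans:
        speed = math.hypot(vx, vy)
        dist = math.hypot(x - robot_xy[0], y - robot_xy[1])
        if dist > r_obs or speed < 1e-6 or g_norm < 1e-6:
            continue  # out of sensing range, standing still, or already at goal
        cos_sim = (vx * gx + vy * gy) / (speed * g_norm)
        if cos_sim >= min_cos:
            compatible.append((x, y))
    if not compatible:
        return goal_xy
    n = len(compatible)
    return (sum(p[0] for p in compatible) / n, sum(p[1] for p in compatible) / n)


def track_follow_point(robot_xy, follow_xy, humans, v_max=1.0, horizon=5,
                       dt=0.2, clearance=0.5, n_headings=16):
    """Sampling-based stand-in for the low-level MPC (an assumption).

    Rolls out constant-velocity candidates over a short horizon, discards any
    that come within `clearance` of a predicted pedestrian position, and
    returns the safe candidate ending closest to the follow point.
    Returns (0.0, 0.0) when no candidate is safe.
    """
    candidates = [(0.0, 0.0)]
    for k in range(n_headings):
        angle = 2.0 * math.pi * k / n_headings
        candidates.append((v_max * math.cos(angle), v_max * math.sin(angle)))

    best_cmd, best_cost = (0.0, 0.0), math.inf
    for ux, uy in candidates:
        safe = True
        for step in range(1, horizon + 1):
            t = step * dt
            px, py = robot_xy[0] + ux * t, robot_xy[1] + uy * t
            # Pedestrians are predicted with a constant-velocity model.
            if any(math.hypot(px - (x + vx * t), py - (y + vy * t)) < clearance
                   for x, y, vx, vy in humans):
                safe = False
                break
        if not safe:
            continue
        t_end = horizon * dt
        end_x, end_y = robot_xy[0] + ux * t_end, robot_xy[1] + uy * t_end
        cost = math.hypot(follow_xy[0] - end_x, follow_xy[1] - end_y)
        if cost < best_cost:
            best_cmd, best_cost = (ux, uy), cost
    return best_cmd


humans = [(2.0, 1.0, 1.0, 0.0), (2.0, 2.0, 1.0, 0.1), (3.0, -3.0, -1.0, 0.0)]
follow = select_follow_point((0.0, 0.0), (10.0, 0.0), humans)
cmd = track_follow_point((0.0, 0.0), follow, humans)
```

In this toy scene the follow point lands between the two pedestrians heading toward the goal, and the tracker steers toward it rather than straight at the goal. When every candidate is unsafe, the tracker returns a zero command, which is the kind of stall the freezing robot problem refers to.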

Paper Structure (18 sections, 2 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Example of HiCrowd navigating in a real-world dense-crowd environment. The mobile robot platform is marked with an orange circle. Pedestrians moving in a similar direction are shown in cyan, while those walking in the opposite direction are shown in magenta. At $t=4$, the robot aligns with the pedestrian flow in front and follows it to the right, without yet observing the opposing group that appears later. When the oncoming pedestrians (magenta) appear at $t=7$, the robot, already aligned with the group ahead, smoothly follows the flow to the right and avoids collision. At $t=12$ and $t=17$, the robot maintains alignment with the surrounding flow and avoids another group, achieving safe and efficient progress.
  • Figure 2: Method overview. The robot observes nearby humans within its sensing radius $r_\mathrm{obs}$ along with its own state $s$ and goal $g$. A high-level RL policy outputs a follow point $(f_x,f_y)$ that guides the robot to align with a suitable crowd flow. The low-level MPC controller then generates a control action $u$ to track this follow point while ensuring collision avoidance and dynamic feasibility. The RL policy is updated based on rewards that combine goal reaching, progress, and crowd following.
  • Figure 3: Example episode from the ETH-UCY dataset in the offline setting ($t$ in seconds). This case involves a dense crowd with both moving and static groups. Groups are visualized as convex hulls. The robot's observation range is marked as a grey circle. Baseline methods take longer paths and more time to reach the goal, with multiple reactive avoidance actions. ORCA gets stuck near dense crowds, CrowdAttn freezes several times, and ORCA, SARL, and CrowdAttn all end in collisions. In contrast, HiCrowd successfully reaches the goal with a shorter navigation time.
  • Figure 4: Example episode from the Synthetic dataset in the online setting ($t$ in seconds). The scenario contains two opposing pedestrian flows: one moving left to right and the other right to left. The robot starts on the left and aims for a goal on the right, located in the upper region where pedestrians move in the opposite direction. Taking the shortest direct path leads the robot into the dense opposing flow, which is initially outside the robot's observation radius. MPC and ORCA exhibit this behavior, becoming stuck in the oncoming flow and requiring a much longer time to reach the goal. SARL avoids the dense crowd entirely, taking a long detour. CrowdAttn follows a straight path and later runs into the opposing flow, forcing the oncoming social groups to split apart to avoid it. This results in multiple reactive actions and periods of freezing. In contrast, HiCrowd aligns with the pedestrian flow in the lower part of the scene, taking a detour but moving with the crowd, and reaches the goal in the shortest time.
  • Figure 5: Real-world experiment setup and examples. (a) Our differential-drive robot and its sensor configuration. (b) Illustration of the footprint of our robot. It is a circle of 0.5 m radius that encompasses the robot and the predefined nominal human offset. (c) Qualitative examples. Robot movement is shown with yellow arrows. Different colors represent different groups of surrounding pedestrians. Top row: With the flow scenario. The robot merged into the pedestrian flow formed by the cyan and blue groups and naturally avoided the magenta group. Middle row: Against the flow scenario. The robot navigated successfully against the pedestrian flow, which consists of at least 3 groups. Bottom row: Crossing the flow scenario. The robot navigated through a crossing flow and reached the goal successfully without collision.
  • ...and 1 more figure
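
The Figure 2 caption says the RL policy is updated with rewards that combine goal reaching, progress, and crowd following, but the captions do not give the formulas. The Python sketch below shows one plausible way such a three-term reward could be composed. The term definitions (a sparse goal bonus, the reduction in distance to the goal, and the clipped cosine similarity between the robot velocity and a local flow velocity) and all weights are assumptions for illustration, not the paper's reward.

```python
import math


def step_reward(prev_dist, dist, robot_vel, flow_vel,
                goal_tol=0.3, w_goal=10.0, w_prog=1.0, w_follow=0.5):
    """Hypothetical per-step reward with three terms (all values assumed).

    prev_dist, dist: distance to the goal before and after the step.
    robot_vel, flow_vel: (vx, vy) of the robot and of the nearby crowd flow.
    """
    # Goal reaching: sparse bonus once the robot is within the tolerance.
    r_goal = w_goal if dist <= goal_tol else 0.0

    # Progress: positive when the step reduced the distance to the goal.
    r_prog = w_prog * (prev_dist - dist)

    # Crowd following: reward moving in the same direction as the local flow;
    # opposing or undefined directions contribute nothing.
    robot_speed = math.hypot(robot_vel[0], robot_vel[1])
    flow_speed = math.hypot(flow_vel[0], flow_vel[1])
    if robot_speed < 1e-6 or flow_speed < 1e-6:
        r_follow = 0.0
    else:
        cos_sim = (robot_vel[0] * flow_vel[0] + robot_vel[1] * flow_vel[1]) / (
            robot_speed * flow_speed)
        r_follow = w_follow * max(0.0, cos_sim)

    return r_goal + r_prog + r_follow


with_flow = step_reward(5.0, 4.5, (1.0, 0.0), (1.0, 0.0))
against_flow = step_reward(5.0, 4.5, (1.0, 0.0), (-1.0, 0.0))
```

With these assumed weights, the same half-meter of progress earns a higher reward when the robot moves with the flow than against it, which is the kind of shaping a crowd-following term is meant to provide.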