MeSA-DRL: Memory-Enhanced Deep Reinforcement Learning for Advanced Socially Aware Robot Navigation in Crowded Environments

Mannan Saeed Muhammad, Estrella Montero

TL;DR

MeSA-DRL introduces memory-enhanced deep reinforcement learning for socially aware robot navigation in crowds, leveraging GRU-based memory to retain pedestrian context and an attention mechanism to prioritize human-robot interactions. The approach integrates a global planning layer and a multi-term reward that includes dynamic warning zones to promote safe, efficient trajectories, reducing freezing and improving robustness in dense crowds. Empirical results in simulations and real-world Turtlebot experiments show MeSA-DRL achieving higher success rates, lower collision rates, and shorter path lengths and traversal times than CADRL, LSTM-RL, SARL, and CAM-RL, with statistical significance. The work demonstrates practical impact for service robots requiring reliable, socially compliant navigation in real environments, enabling faster deployment without retraining in new crowds.

Abstract

Autonomous navigation capabilities play a critical role in service robots operating in environments where human interactions are pivotal, due to the dynamic and unpredictable nature of these environments. However, the variability in human behavior presents a substantial challenge for robots in predicting and anticipating movements, particularly in crowded scenarios. To address this issue, a memory-enabled deep reinforcement learning framework is proposed for autonomous robot navigation in diverse pedestrian scenarios. The proposed framework leverages long-term memory to retain essential information about the surroundings and model sequential dependencies effectively. The importance of human-robot interactions is also encoded to assign higher attention to these interactions. A global planning mechanism is incorporated into the memory-enabled architecture. Additionally, a multi-term reward system is designed to prioritize and encourage long-sighted robot behaviors by incorporating dynamic warning zones. Simultaneously, it promotes smooth trajectories and minimizes the time taken to reach the robot's desired goal. Extensive simulation experiments show that the suggested approach outperforms representative state-of-the-art methods, showcasing its ability to achieve navigation efficiency and safety in real-world scenarios.
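To make the idea of a multi-term reward with dynamic warning zones more concrete, the following is a minimal sketch and not the paper's actual reward. The abstract only states that the reward combines warning zones, trajectory smoothness, and time to goal. Everything else here is an assumption made for illustration: the function names, the rule that the zone radius grows linearly with pedestrian speed, and all constants.

```python
def warning_zone_radius(human_speed, base=0.2, gain=0.5):
    """Hypothetical dynamic zone: the radius grows with pedestrian speed."""
    return base + gain * human_speed


def reward(dist_to_goal, min_gap, human_speed, heading_change,
           goal_tol=0.1, step_cost=0.01, smooth_weight=0.05):
    """Illustrative multi-term reward; all weights are assumed values.

    min_gap is the clearance to the nearest pedestrian (<= 0 means contact).
    """
    if min_gap <= 0.0:
        return -1.0                      # collision term
    if dist_to_goal <= goal_tol:
        return 1.0                       # goal-reached term
    r = -step_cost                       # time term: every step costs a little
    r -= smooth_weight * abs(heading_change)  # smoothness term
    zone = warning_zone_radius(human_speed)
    if min_gap < zone:
        # Warning-zone term: penalty scales with intrusion depth.
        r -= 0.5 * (zone - min_gap) / zone
    return r
```

Under these assumptions, the same 0.5 m clearance is penalized when the nearest pedestrian moves fast (larger zone) but not when the pedestrian is slow, which is one way a reward could encourage the long-sighted behavior the abstract describes.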

Paper Structure

This paper contains 11 sections, 13 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Schematic of the proposed network model. Pedestrian and robot state vectors are concatenated to form a pairwise combined state vector, with the network outputting policy $\pi$ over possible actions and the value $V$ of the current state.
  • Figure 2: Illustration of the robot path traversals for each method in an obstacle-free space.
  • Figure 3: Spread of path deviation from the straight path at $x=0$ for each method's path traversal.
  • Figure 4: Comparative analysis of local trajectories in a random test episode featuring ten humans within a grouped-human scenario. All experimental conditions maintain identical starting points, goal positions, and time steps.
  • Figure 5: Real-world experiments in crowded scenarios.
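The Figure 1 caption describes concatenating the robot state with each pedestrian state into a pairwise vector, and the abstract mentions assigning higher attention to important human-robot interactions. The sketch below illustrates those two steps only, under loud assumptions: the four-element state layout, the hand-picked scoring weights, and the function names are hypothetical, and the GRU memory and the policy and value heads are omitted. A trained network would learn the scoring function rather than use fixed weights.

```python
import math


def pairwise_states(robot, humans):
    """Concatenate the robot state with each pedestrian state."""
    return [robot + h for h in humans]


def softmax(xs):
    m = max(xs)                          # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(pairs, score_weights):
    """Score each pair linearly, normalize, and return the weighted sum."""
    scores = [sum(w * x for w, x in zip(score_weights, p)) for p in pairs]
    alphas = softmax(scores)
    dim = len(pairs[0])
    pooled = [sum(a * p[i] for a, p in zip(alphas, pairs)) for i in range(dim)]
    return alphas, pooled


# Assumed layout: [px, py, vx, vy] in a robot-centered frame.
robot = [0.0, 0.0, 1.0, 0.0]
humans = [[1.0, 0.5, -0.5, 0.0], [4.0, 3.0, 0.0, 0.2]]
pairs = pairwise_states(robot, humans)

# Toy weights: the score drops as the pedestrian's x and y offsets grow.
weights = [0.0, 0.0, 0.0, 0.0, -1.0, -1.0, 0.0, 0.0]
alphas, pooled = attention_pool(pairs, weights)
```

With these toy weights the nearer pedestrian receives most of the attention mass, and `pooled` is a fixed-size crowd summary regardless of how many pedestrians are present. That fixed size is what lets a downstream policy or value head accept crowds of varying size.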