MeSA-DRL: Memory-Enhanced Deep Reinforcement Learning for Advanced Socially Aware Robot Navigation in Crowded Environments
Mannan Saeed Muhammad, Estrella Montero
TL;DR
MeSA-DRL introduces memory-enhanced deep reinforcement learning for socially aware robot navigation in crowds, leveraging GRU-based memory to retain pedestrian context and an attention mechanism to prioritize human-robot interactions. The approach integrates a global planning layer and a multi-term reward that includes dynamic warning zones to promote safe, efficient trajectories, reducing freezing and improving robustness in dense crowds. Empirical results in simulations and real-world TurtleBot experiments show MeSA-DRL achieving higher success rates, lower collision rates, and shorter path lengths and traversal times than CADRL, LSTM-RL, SARL, and CAM-RL, with statistical significance. The work demonstrates practical impact for service robots requiring reliable, socially compliant navigation in real environments, enabling faster deployment without retraining in new crowds.
Abstract
Autonomous navigation capabilities play a critical role in service robots operating in environments where human interactions are pivotal, due to the dynamic and unpredictable nature of these environments. However, the variability in human behavior presents a substantial challenge for robots in predicting and anticipating movements, particularly in crowded scenarios. To address this issue, a memory-enabled deep reinforcement learning framework is proposed for autonomous robot navigation in diverse pedestrian scenarios. The proposed framework leverages long-term memory to retain essential information about the surroundings and model sequential dependencies effectively. The importance of human-robot interactions is also encoded to assign higher attention to these interactions. A global planning mechanism is incorporated into the memory-enabled architecture. Additionally, a multi-term reward system is designed to prioritize and encourage long-sighted robot behaviors by incorporating dynamic warning zones. The reward also promotes smooth trajectories and minimizes the time taken to reach the robot's desired goal. Extensive simulation experiments show that the proposed approach outperforms representative state-of-the-art methods, showcasing its ability to achieve navigation efficiency and safety in real-world scenarios.
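The multi-term reward described above can be illustrated with a minimal sketch. The abstract does not give the reward's exact terms, weights, or the rule that sizes the warning zones, so everything below is an assumption made for illustration: the `RewardConfig` values, the speed-dependent `warning_radius` rule, and the function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    # All values are illustrative assumptions, not the paper's settings.
    goal_reward: float = 1.0
    collision_penalty: float = -0.25
    base_warning_radius: float = 0.2   # metres, minimum comfort margin
    speed_gain: float = 0.5            # seconds, widens the zone for fast pedestrians
    warning_weight: float = 0.5
    smoothness_weight: float = 0.01
    time_penalty: float = -0.001


def warning_radius(ped_speed, cfg):
    """Assumed rule: the warning zone grows linearly with pedestrian speed."""
    return cfg.base_warning_radius + cfg.speed_gain * ped_speed


def multi_term_reward(reached_goal, gaps, ped_speeds, heading_change, cfg=None):
    """Per-step reward combining goal, collision, warning-zone,
    smoothness, and time terms.

    gaps: surface-to-surface distance (m) from the robot to each pedestrian;
          a negative gap means the robot overlaps that pedestrian.
    ped_speeds: speed (m/s) of each pedestrian, same order as gaps.
    heading_change: change in robot heading (rad) since the previous step.
    """
    cfg = cfg or RewardConfig()
    if any(gap < 0 for gap in gaps):
        return cfg.collision_penalty
    if reached_goal:
        return cfg.goal_reward

    reward = cfg.time_penalty  # small cost per step encourages short traversal times
    for gap, speed in zip(gaps, ped_speeds):
        radius = warning_radius(speed, cfg)
        if gap < radius:
            # Penalty grows the deeper the robot intrudes into the warning zone.
            reward -= cfg.warning_weight * (radius - gap)
    # Penalise sharp heading changes to favour smooth trajectories.
    reward -= cfg.smoothness_weight * abs(heading_change)
    return reward
```

In this sketch, a pedestrian 0.3 m away incurs no warning penalty when standing still (assumed zone radius 0.2 m) but does when walking at 1 m/s (assumed zone radius 0.7 m), which is one simple way a "dynamic" zone could encourage the robot to give moving pedestrians more room earlier.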
