RUMOR: Reinforcement learning for Understanding a Model of the Real World for Navigation in Dynamic Environments
Diego Martinez-Baselga, Luis Riazuelo, Luis Montano
TL;DR
RUMOR addresses autonomous navigation in highly dynamic environments by combining a model-based environment abstraction, the Dynamic Object Velocity Space ($DOVS$), with a deep reinforcement learning controller (Soft Actor-Critic) that acts in a kinodynamics-aware continuous action space. The $DOVS$ gives the robot a robocentric velocity representation of the dynamic scene, and the action space ensures that the commands it produces respect differential-drive kinematics. Key contributions include the $DOVS$ formulation (combining Dynamic Object Velocities and Free Velocities over a horizon $T_h$), a two-stream encoder with an LSTM for handling occlusions, and a training setup that uses realistic simulation to reduce the sim-to-real gap; real-world tests with pedestrians demonstrate transferability. The results show NR-RUMOR achieving higher success rates and competitive or better navigation times than a broad set of baselines in dense, dynamic, and unseen environments. Overall, the work embeds a model-based environment abstraction into a DRL framework to obtain kinodynamics-aware planning for differential-drive robots in dynamic environments.
Abstract
Autonomous navigation in dynamic environments is a complex but essential task for autonomous robots, and recent deep reinforcement learning approaches have shown promising results. However, the complexity of the real world makes it infeasible to train agents in every possible scenario configuration. Moreover, existing methods typically overlook factors such as robot kinodynamic constraints, or assume perfect knowledge of the environment. In this work, we present RUMOR, a novel planner for differential-drive robots that uses deep reinforcement learning to navigate in highly dynamic environments. Unlike other end-to-end DRL planners, it uses a descriptive robocentric velocity space model to extract the dynamic environment information, enhancing training effectiveness and scenario interpretation. Additionally, we propose an action space that inherently considers robot kinodynamics, and we train the agent in a simulator that reproduces the problematic aspects of the real world, reducing the gap between reality and simulation. We extensively compare RUMOR with other state-of-the-art approaches, demonstrating better performance, and provide a detailed analysis of the results. Finally, we validate RUMOR's performance in real-world settings by deploying it on a ground robot. Our experiments, conducted in crowded scenarios and unseen environments, confirm the algorithm's robustness and transferability.
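To illustrate what an action space that "inherently considers robot kinodynamics" can look like for a differential-drive robot, the following is a minimal sketch and not necessarily the formulation used in RUMOR. It assumes the policy outputs a normalized action in $[-1, 1]^2$ that is interpreted as a fraction of the maximum linear and angular acceleration, so every resulting velocity command is reachable from the current velocity within one control step. The names `DiffDriveLimits` and `action_to_command`, and all numeric limits, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffDriveLimits:
    """Hypothetical limits for a differential-drive robot (illustrative values)."""
    v_max: float = 0.7    # maximum linear velocity [m/s]
    w_max: float = 3.14   # maximum angular velocity magnitude [rad/s]
    a_v_max: float = 0.5  # maximum linear acceleration [m/s^2]
    a_w_max: float = 2.0  # maximum angular acceleration [rad/s^2]


def clip(x: float, lo: float, hi: float) -> float:
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def action_to_command(action, v, w, dt, limits=DiffDriveLimits()):
    """Map a normalized action in [-1, 1]^2 to a feasible (v, w) command.

    The action scales the maximum accelerations, so the new command never
    requires more acceleration than the robot can deliver in one step of
    length dt. The result is then clipped to the velocity limits
    (no backward motion is assumed here).
    """
    a_v = clip(action[0], -1.0, 1.0)
    a_w = clip(action[1], -1.0, 1.0)
    v_next = clip(v + a_v * limits.a_v_max * dt, 0.0, limits.v_max)
    w_next = clip(w + a_w * limits.a_w_max * dt, -limits.w_max, limits.w_max)
    return v_next, w_next
```

With this kind of mapping, the learning agent cannot request a velocity jump that the platform cannot execute, which is one way to keep simulated and real behavior consistent.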
