Learning to navigate efficiently and precisely in real environments
Guillaume Bono, Hervé Poirier, Leonid Antsfeld, Gianluca Monaci, Boris Chidlovskii, Christian Wolf
TL;DR
Problem: closing the sim2real gap for end-to-end navigation policies trained in simulation. Method: introduce a fast, second-order dynamical model of the robot and integrate realistic sensing noise within Habitat, training policies to predict discretized velocity commands that are executed in closed loop. Contributions: integrated motion model, dual localization signals, and thorough real-robot and large-scale simulation evaluations showing substantial gains over prior end-to-end methods. Significance: enables robust, efficient navigation in real environments by narrowing the sim2real gap through more realistic dynamics.
Abstract
In the context of autonomous navigation of terrestrial robots, building realistic models of agent dynamics and sensing is common practice in the robotics literature and in commercial applications, where they are used for model-based control and/or for localization and mapping. The more recent Embodied AI literature, on the other hand, focuses on modular or end-to-end agents trained in simulators like Habitat or AI-Thor, where the emphasis is put on photo-realistic rendering and scene diversity, while high-fidelity robot motion plays a lesser role. The resulting sim2real gap significantly impacts the transfer of trained models to real robotic platforms. In this work we explore end-to-end training of agents in simulation in settings which minimize the sim2real gap in both sensing and actuation. Our agent directly predicts (discretized) velocity commands, which are maintained through closed-loop control on the real robot. The behavior of the real robot (including the underlying low-level controller) is identified and simulated in a modified Habitat simulator. Noise models for odometry and localization further contribute to lowering the sim2real gap. We evaluate on real navigation scenarios, explore different localization and point goal calculation methods, and report significant gains in performance and robustness compared to prior work.
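As a rough illustration of the idea of discretized velocity commands tracked by a second-order response, the following minimal sketch rolls out a sequence of discrete actions on a planar unicycle. It is not the identified model from this work: the command grid, the natural frequency `omega_n`, the damping ratio `zeta`, the time step, and all function names are hypothetical values chosen only for the example.

```python
import math

# Hypothetical discretized command set: (linear m/s, angular rad/s) pairs.
LINEAR = [0.0, 0.25, 0.5]
ANGULAR = [-1.0, 0.0, 1.0]
ACTIONS = [(v, w) for v in LINEAR for w in ANGULAR]


def second_order_step(x, dx, target, omega_n, zeta, dt):
    """One semi-implicit Euler step of x'' = wn^2 (target - x) - 2 zeta wn x'."""
    ddx = omega_n ** 2 * (target - x) - 2.0 * zeta * omega_n * dx
    dx = dx + ddx * dt
    x = x + dx * dt
    return x, dx


def simulate(action_ids, dt=0.01, hold=0.5, omega_n=8.0, zeta=1.0):
    """Roll out discrete actions, each held for `hold` seconds.

    Returns the final pose (px, py, th) and the final velocities (v, w).
    omega_n and zeta are assumed values, not identified from a real robot.
    """
    px = py = th = 0.0
    v = dv = w = dw = 0.0
    steps = round(hold / dt)
    for a in action_ids:
        v_cmd, w_cmd = ACTIONS[a]
        for _ in range(steps):
            # Velocities lag behind the commanded values (second-order response).
            v, dv = second_order_step(v, dv, v_cmd, omega_n, zeta, dt)
            w, dw = second_order_step(w, dw, w_cmd, omega_n, zeta, dt)
            # Unicycle kinematics integrate the pose from the realized velocities.
            px += v * math.cos(th) * dt
            py += v * math.sin(th) * dt
            th += w * dt
    return px, py, th, v, w
```

For example, `simulate([ACTIONS.index((0.5, 0.0))] * 4)` holds a 0.5 m/s forward command for two seconds. Because the realized velocity lags the command, the distance travelled is somewhat less than the 1.0 m that an instantaneous-velocity model would predict, which is the kind of discrepancy a motion model in the simulator is meant to capture.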
