MAER-Nav: Bidirectional Motion Learning Through Mirror-Augmented Experience Replay for Robot Navigation
Shanze Wang, Mingao Tan, Zhibo Yang, Biao Huang, Xiaoyu Shen, Hailong Huang, Wei Zhang
TL;DR
MAER-Nav addresses the limited action flexibility of DRL-based robot navigation with Mirror-Augmented Experience Replay, which generates synthetic backward-motion experiences from successful forward trajectories. Combined with bidirectional action recovery and curriculum learning, it learns bidirectional motion using a dual-buffer replay system and a SAC-based network with a LiDAR-focused input transform. In extensive simulation and real-world experiments, MAER-Nav outperforms state-of-the-art DRL methods and a raw MAER baseline, with higher success rates, fewer collisions, and better backward maneuvering in confined and dynamic environments. The approach preserves forward navigation performance while expanding maneuverability, offering a practical path toward more versatile autonomous navigation without additional sensors or reward engineering.
Abstract
Deep Reinforcement Learning (DRL)-based navigation methods have demonstrated promising results for mobile robots, but they suffer from limited action flexibility in confined spaces. Conventional DRL approaches predominantly learn forward-motion policies, so robots become trapped in complex environments where backward maneuvers are necessary for recovery. This paper presents MAER-Nav (Mirror-Augmented Experience Replay for Robot Navigation), a novel framework that enables bidirectional motion learning without requiring explicit failure-driven hindsight experience replay or reward function modifications. Our approach integrates a mirror-augmented experience replay mechanism with curriculum learning to generate synthetic backward navigation experiences from successful trajectories. Experimental results in both simulation and real-world environments demonstrate that MAER-Nav significantly outperforms state-of-the-art methods while maintaining strong forward navigation capabilities. The framework bridges the gap between the comprehensive action space utilization of traditional planning methods and the environmental adaptability of learning-based approaches, enabling robust navigation in scenarios where conventional DRL methods consistently fail.
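
To make the idea of generating backward experiences from successful trajectories concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not specify the mirror transform, so the sketch assumes one geometrically consistent choice: the same motion is re-expressed in a robot frame rotated by 180 degrees, which shifts a 360-degree LiDAR scan by half its length, negates the goal coordinates, and flips the sign of the linear velocity while leaving the angular velocity unchanged. The names `Transition`, `mirror`, `DualReplayBuffer`, and `synthetic_fraction` are hypothetical, and a curriculum could be approximated by changing `synthetic_fraction` over training.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Transition:
    scan: Tuple[float, ...]      # LiDAR ranges, assumed to cover a full 360 degrees
    goal: Tuple[float, float]    # goal position (x, y) in the robot frame
    action: Tuple[float, float]  # (linear velocity, angular velocity)
    reward: float
    next_scan: Tuple[float, ...]
    next_goal: Tuple[float, float]
    done: bool


def rotate_scan_180(scan: Tuple[float, ...]) -> Tuple[float, ...]:
    """Shift the scan by half its length (assumes an even number of beams)."""
    half = len(scan) // 2
    return scan[half:] + scan[:half]


def mirror(t: Transition) -> Transition:
    """Assumed mirror transform: express the transition in a frame rotated by 180 degrees."""
    return Transition(
        scan=rotate_scan_180(t.scan),
        goal=(-t.goal[0], -t.goal[1]),
        action=(-t.action[0], t.action[1]),  # forward motion becomes backward motion
        reward=t.reward,
        next_scan=rotate_scan_180(t.next_scan),
        next_goal=(-t.next_goal[0], -t.next_goal[1]),
        done=t.done,
    )


class DualReplayBuffer:
    """Keeps real and mirrored (synthetic) transitions in separate buffers."""

    def __init__(self, capacity: int, synthetic_fraction: float, seed: int = 0):
        self.real = deque(maxlen=capacity)
        self.synthetic = deque(maxlen=capacity)
        self.synthetic_fraction = synthetic_fraction
        self.rng = random.Random(seed)

    def add_episode(self, episode: List[Transition], success: bool) -> None:
        """Store every transition; mirror only episodes that reached the goal."""
        for t in episode:
            self.real.append(t)
            if success:
                self.synthetic.append(mirror(t))

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw a batch that mixes real and synthetic transitions."""
        n_syn = min(int(batch_size * self.synthetic_fraction), len(self.synthetic))
        n_real = min(batch_size - n_syn, len(self.real))
        return (self.rng.sample(list(self.real), n_real)
                + self.rng.sample(list(self.synthetic), n_syn))
```

In this sketch, an off-policy learner such as SAC would call `add_episode` at the end of each rollout and `sample` at each update step. Mirroring only successful episodes follows the abstract's description; how the original method weights the two buffers is not stated here and is left as a tunable assumption.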
