Improving Generalization in Aerial and Terrestrial Mobile Robots Control Through Delayed Policy Learning
Ricardo B. Grando, Raul Steinmetz, Victor A. Kich, Alisson H. Kolling, Pablo M. Furik, Junior C. de Jesus, Bruna V. Guterres, Daniel T. Gamarra, Rodrigo S. Guerra, Paulo L. J. Drews-Jr
TL;DR
The paper addresses generalization gaps in deep reinforcement learning for autonomous aerial and terrestrial navigation by evaluating Delayed Policy Updates (DPU) within the TD3 framework. It systematically varies the update delay parameter $\eta$ and shows that larger delays can accelerate early learning and markedly improve performance in unseen environments: with $\eta=8$, success rates reach up to $98.67\%$ on unseen aerial tasks and around $85\%$ on unseen terrestrial tasks. The results support adopting larger policy update delays to mitigate overfitting and promote robust continuous-control policies across diverse scenarios. Experiments use ROS/Gazebo simulations with LIDAR sensing in obstacle-rich environments. These findings offer practical guidance for designing generalizable mobile-robot controllers and motivate further study of how DPU interacts with other regularization and augmentation methods.
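As a rough illustration of what the delay parameter controls, the Python sketch below counts updates in a TD3-style training loop, where the critics are updated at every training step and the actor (together with the target networks) only once every $\eta$ steps. The function name `dpu_schedule` and its arguments are hypothetical and not taken from the paper, and no networks are actually trained here.

```python
def dpu_schedule(total_steps: int, policy_delay: int) -> tuple[int, int]:
    """Count critic and actor updates in a TD3-style loop (illustrative only).

    total_steps:  number of training steps (hypothetical argument).
    policy_delay: the update delay, written as eta in the text above.
    """
    if policy_delay < 1:
        raise ValueError("policy_delay must be at least 1")

    critic_updates = 0
    actor_updates = 0
    for step in range(1, total_steps + 1):
        # The critics are updated at every training step.
        critic_updates += 1
        # The actor and target networks are updated only every
        # `policy_delay` steps; this is the delayed policy update.
        if step % policy_delay == 0:
            actor_updates += 1
    return critic_updates, actor_updates


if __name__ == "__main__":
    for eta in (2, 4, 8):
        critics, actors = dpu_schedule(total_steps=16, policy_delay=eta)
        print(f"eta={eta}: {critics} critic updates, {actors} actor updates")
```

Under this sketch, a delay of 8 gives the actor one update per eight critic updates, compared with one per two at the delay of 2 commonly used with TD3.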
Abstract
Deep Reinforcement Learning (DRL) has emerged as a promising approach to enhancing motion control and decision-making across a wide range of robotic applications. While prior research has demonstrated the efficacy of DRL algorithms in enabling autonomous mapless navigation for aerial and terrestrial mobile robots, these methods often generalize poorly when faced with unknown tasks and environments. This paper explores the impact of the Delayed Policy Updates (DPU) technique on generalization to new situations and on the overall performance of agents. Our analysis of DPU in aerial and terrestrial mobile robots reveals that this technique significantly reduces the generalization gap and accelerates learning, improving the agents' efficiency across diverse tasks and unknown scenarios.
