Skill Q-Network: Learning Adaptive Skill Ensemble for Mapless Navigation in Unknown Environments
Hyunki Seong, David Hyunchul Shim
TL;DR
This work tackles mapless navigation in unknown environments by proposing Skill Q-Network (SQN), a reinforcement learning framework that simultaneously learns a high-level skill decision process and multiple low-level navigation skills without requiring predefined priors. SQN uses a perception-planning architecture with latent skill policies and a softmax-based skill decision that ensembles their Q-values, and it is trained under an R2D2 framework with burn-in to handle sequential observations. A tailored reward combines exploration and reachability signals, which lets the agent navigate effectively and escape local minima. Results show performance gains of up to about 40% over baselines and robust zero-shot transfer to unseen, out-of-distribution environments such as those with non-convex obstacles and cave-like terrain. The approach is adaptable and robust, which suggests practical potential for real-world deployment; future work targets dynamics-level versatility and sim-to-real transfer via domain randomization.
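To make the softmax-based skill ensemble concrete, the following is a minimal sketch and not the authors' implementation. It assumes a discrete action set, one Q-value vector per low-level skill, and a vector of skill-decision logits produced by the high-level decision; the names `softmax`, `ensemble_q_values`, `skill_logits`, and `skill_q_values` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_q_values(skill_logits, skill_q_values):
    """Combine per-skill Q-values into one Q-value vector.

    skill_logits:   K floats, one logit per skill (hypothetical output
                    of the high-level skill decision).
    skill_q_values: K lists of A floats, the Q-values each low-level
                    skill assigns to the A discrete actions (assumed).
    Returns (weights, q), where weights are the softmax skill weights
    and q[a] is the weighted sum over skills of skill_q_values[k][a].
    """
    weights = softmax(skill_logits)
    num_actions = len(skill_q_values[0])
    q = [
        sum(w * q_k[a] for w, q_k in zip(weights, skill_q_values))
        for a in range(num_actions)
    ]
    return weights, q


# Toy example: three skills, two actions. The first skill gets the
# largest logit, so its preferred action dominates the ensemble.
skill_logits = [2.0, 0.0, 0.0]
skill_q_values = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
weights, q = ensemble_q_values(skill_logits, skill_q_values)
greedy_action = max(range(len(q)), key=lambda a: q[a])
```

Because the weights are a differentiable function of the logits, a scheme of this kind lets the skill decision and the skill policies be trained jointly from the same temporal-difference loss, which is consistent with the concurrent learning described above.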
Abstract
This paper focuses on the acquisition of mapless navigation skills in unknown environments. We introduce the Skill Q-Network (SQN), a novel reinforcement learning method featuring an adaptive skill ensemble mechanism. Unlike existing methods, our model concurrently learns a high-level skill decision process alongside multiple low-level navigation skills, all without the need for prior knowledge. Leveraging a reward function tailored to mapless navigation, SQN learns adaptive maneuvers that incorporate both exploration and goal-directed skills, enabling effective navigation in new environments. Our experiments demonstrate that SQN can effectively navigate complex environments, achieving 40% higher performance than baseline models. Without explicit guidance, SQN discovers how to combine low-level skill policies, exhibiting both goal-directed navigation to reach destinations and exploration maneuvers to escape local-minimum regions in challenging scenarios. Remarkably, our adaptive skill ensemble method enables zero-shot transfer to out-of-distribution domains, characterized by unseen observations from non-convex obstacles or uneven, subterranean-like environments. The project page is available at https://sites.google.com/view/skill-q-net.
