HOPE: A Reinforcement Learning-based Hybrid Policy Path Planner for Diverse Parking Scenarios
Mingyang Jiang, Yueyuan Li, Songan Zhang, Siyuan Chen, Chunxiang Wang, Ming Yang
TL;DR
HOPE addresses parking path planning across diverse, real-world scenarios by combining a reinforcement learning (RL) agent with Reeds-Shepp (RS) curves. Its key components are a hybrid policy that blends a learnable RL policy with a rule-based RS policy, an action-mask mechanism that prunes unsafe actions to guide exploration, a transformer that fuses perceived environmental information with the mask, and a difficulty-ranking criterion for scenario generation and evaluation. In experiments, HOPE achieves higher planning success rates and better generalization than rule-based methods and plain RL baselines across normal, complex, and extreme parking scenarios, and it is validated in real-world indoor parking tests. The results indicate that combining classical geometric priors with learning-based planning yields a robust and efficient parking planner suited to practical deployment, with dynamic environments as a direction for future extension.
Abstract
Automated parking is a highly anticipated application of autonomous driving technology. However, existing path planning methods fall short of this need because they cannot handle the diverse and complex parking scenarios encountered in reality. Non-learning methods provide reliable planning results but struggle in intricate scenarios, whereas learning-based methods are good at exploration but unstable in converging to feasible solutions. To leverage the strengths of both approaches, we introduce Hybrid pOlicy Path plannEr (HOPE). This novel solution integrates a reinforcement learning agent with Reeds-Shepp curves, enabling effective planning across diverse scenarios. HOPE guides the exploration of the reinforcement learning agent by applying an action mask mechanism and employs a transformer to integrate the perceived environmental information with the mask. To facilitate the training and evaluation of the proposed planner, we propose a criterion for categorizing the difficulty level of parking scenarios based on space and obstacle distribution. Experimental results demonstrate that our approach outperforms typical rule-based algorithms and traditional reinforcement learning methods, showing higher planning success rates and better generalization across various scenarios. We also conduct real-world experiments to verify the practicability of HOPE. The code for our solution is openly available at https://github.com/jiamiya/HOPE.
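To make the action mask idea concrete, the following is a minimal, generic sketch of action masking over a discrete action set. It is not taken from the HOPE code base: the function name `masked_action_probs`, the four-action example, and the assumption that the policy outputs one logit per discrete action are all illustrative. The mask simply forces the probability of actions flagged as invalid (for example, actions that would lead to a collision) to zero before sampling.

```python
import numpy as np


def masked_action_probs(logits, valid_mask):
    """Softmax over logits with invalid actions forced to probability zero.

    `logits` and `valid_mask` are hypothetical inputs: one score per
    discrete action and a boolean flag marking which actions are allowed.
    """
    logits = np.asarray(logits, dtype=float)
    valid_mask = np.asarray(valid_mask, dtype=bool)
    if not valid_mask.any():
        raise ValueError("at least one action must be valid")
    # Invalid actions get -inf so that exp(-inf) contributes exactly 0.
    masked = np.where(valid_mask, logits, -np.inf)
    # Subtract the largest valid logit for numerical stability.
    shifted = masked - masked[valid_mask].max()
    exp = np.exp(shifted)
    return exp / exp.sum()


# Illustrative example: the second action is masked out as unsafe.
logits = [1.0, 2.0, 0.5, -1.0]
valid = [True, False, True, True]
probs = masked_action_probs(logits, valid)
```

In this sketch the masked action can never be sampled, so exploration is confined to the actions marked valid, while the remaining probabilities are renormalized to sum to one.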
