RL-OGM-Parking: Lidar OGM-Based Hybrid Reinforcement Learning Planner for Autonomous Parking
Zhitao Wang, Zhe Chen, Mingyang Jiang, Tong Qin, Ming Yang
TL;DR
This work tackles the sim-to-real gap in RL-based autonomous parking by introducing an Occupancy Grid Map (OGM)-driven hybrid planner that fuses a rule-based Reeds-Shepp (RS) path planner with a learning-based planner. Perception is standardized through LiDAR-generated OGMs, enabling consistent inputs for both training and real-time inference and improving transferability to real-world environments. The planner combines an RS baseline for fast, safe trajectories with a Soft Actor-Critic (SAC)-based RL component that explores and refines maneuvers, aided by an action mask to prevent collisions. Across simulation and real-world experiments, the approach outperforms Hybrid A*, SAC, and PPO baselines in Parking Success Rate and maneuver efficiency, validating its practicality for real autonomous parking systems. The results suggest that OGM-based perception and hybrid planning offer a viable path toward robust, generalizable autonomous parking with real-time performance.
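The action mask mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `action_mask`, the kinematic bicycle rollout, the square collision patch, and every numeric default (cell size, wheelbase, time step, safety radius) are assumptions chosen for illustration. The idea shown is only that candidate actions whose short rollout would touch an occupied OGM cell are disabled before the RL policy picks among them.

```python
import numpy as np

def action_mask(ogm, pose, actions, resolution=0.25, wheelbase=2.8, dt=0.5, radius=1.0):
    """Return a boolean array that is True where an action's one-step rollout stays free.

    ogm      : 2D array, non-zero cells are occupied, indexed as ogm[row (y), col (x)].
    pose     : (x, y, heading) in metres / radians, with the grid corner as origin.
    actions  : sequence of (speed, steering_angle) pairs.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rows, cols = ogm.shape
    x, y, yaw = pose
    r_cells = int(np.ceil(radius / resolution))  # half-width of the safety patch in cells
    mask = np.zeros(len(actions), dtype=bool)
    for i, (v, steer) in enumerate(actions):
        # Kinematic bicycle step: update the heading first, then advance along it.
        new_yaw = yaw + v / wheelbase * np.tan(steer) * dt
        nx = x + v * np.cos(new_yaw) * dt
        ny = y + v * np.sin(new_yaw) * dt
        cx, cy = int(nx // resolution), int(ny // resolution)
        # Treat leaving the known map as unsafe.
        if cx - r_cells < 0 or cy - r_cells < 0 or cx + r_cells >= cols or cy + r_cells >= rows:
            continue
        # Conservative check: a square patch that bounds a disc of the given radius.
        patch = ogm[cy - r_cells:cy + r_cells + 1, cx - r_cells:cx + r_cells + 1]
        mask[i] = not patch.any()
    return mask
```

A policy would typically use such a mask by excluding the disabled actions (for example, by assigning them zero probability) before sampling. A real system would check the full vehicle footprint along a longer rollout rather than a single square patch.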
Abstract
Autonomous parking has become a critical application in autonomous driving research and development. Parking operations often take place in limited space and complex environments, requiring accurate perception and precise maneuvering. Traditional rule-based parking algorithms struggle to adapt to diverse and unpredictable conditions, while learning-based algorithms lack consistent and stable performance across scenarios. A hybrid approach is therefore needed, one that combines the stability of rule-based methods with the generalizability of learning-based methods. Recently, reinforcement learning (RL)-based policies have shown robust capability in planning tasks. However, the simulation-to-reality (sim-to-real) transfer gap seriously hinders real-world deployment. To address these problems, we employ a hybrid policy consisting of a rule-based Reeds-Shepp (RS) planner and a learning-based RL planner. A real-time LiDAR-based Occupancy Grid Map (OGM) representation is adopted to bridge the sim-to-real gap, so that the hybrid policy can be applied to real-world systems seamlessly. We conducted extensive experiments in both simulation and real-world scenarios, and the results demonstrate that the proposed method outperforms pure rule-based and learning-based methods. The real-world experiment further validates the feasibility and efficiency of the proposed method.
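To make the OGM representation concrete, the sketch below rasterizes 2D LiDAR returns into a binary occupancy grid centred on the vehicle. It is a simplified illustration under stated assumptions, not the paper's pipeline: the function name `points_to_ogm`, the 64 x 64 grid, and the 0.25 m cell size are hypothetical, and ray casting for free space and temporal filtering are omitted. Because the same rasterization can be applied to simulated and real scans, the planner sees the same input format in both settings.

```python
import numpy as np

def points_to_ogm(points_xy, grid_size=64, resolution=0.25):
    """Rasterize 2D LiDAR returns (vehicle frame, metres) into a binary occupancy grid.

    The vehicle sits at the grid centre; the result is indexed as ogm[row (y), col (x)].
    grid_size and resolution are illustrative assumptions, not values from the paper.
    """
    ogm = np.zeros((grid_size, grid_size), dtype=np.uint8)
    half_extent = grid_size * resolution / 2.0
    pts = np.asarray(points_xy, dtype=float).reshape(-1, 2)
    # Shift so the grid corner is the origin, then convert metres to cell indices.
    idx = np.floor((pts + half_extent) / resolution).astype(int)
    # Discard returns that fall outside the grid.
    inside = np.all((idx >= 0) & (idx < grid_size), axis=1)
    cols, rows = idx[inside, 0], idx[inside, 1]
    ogm[rows, cols] = 1
    return ogm
```

For example, `points_to_ogm([[0.0, 0.0], [1.0, 0.0]])` marks the centre cell and a cell four columns to its right at the assumed 0.25 m resolution.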
