Learning Occlusion-aware Decision-making from Agent Interaction via Active Perception
Jie Jia, Yiming Shu, Zhongxue Gan, Wenchao Ding
TL;DR
The paper introduces Pad-AI, a framework that tackles occlusion-induced uncertainty in autonomous driving by combining a vectorized representation of occluded environments with high-level semantic motion primitives (SMPs) and a safety-aware RL loop. It integrates a prediction model to constrain exploration within RSS-based safety boundaries, enabling risk-aware learning with improved sample efficiency. Validated in challenging dynamic and static occlusion scenarios in CARLA, the approach outperforms strong baselines (EMP, RSA, SOAP) in success rate and speed, with fewer collisions, while planning in real time. Ablation studies support the contribution of the vectorized occlusion representation, the SMPs, and the safety-prediction mechanism. Overall, Pad-AI advances occlusion-aware decision-making by delivering scalable, efficient, and safer active perception for autonomous driving, with potential for real-world deployment.
Abstract
Occlusion-aware decision-making is essential in autonomous driving because occlusions introduce high uncertainty. Recent occlusion-aware decision-making methods suffer from high computational complexity, limited scalability across scenarios, or reliance on limited expert data. Because reinforcement learning (RL) generates its own data automatically through randomized exploration, we find that it shows promise for occlusion-aware decision-making. However, previous occlusion-aware RL methods struggle to extend to varied dynamic and static occlusion scenarios, learn inefficiently, and lack predictive ability. To address these issues, we introduce Pad-AI, a self-reinforcing framework that learns occlusion-aware decision-making through active perception. Pad-AI uses a vectorized representation to encode occluded environments efficiently and learns over semantic motion primitives to focus on high-level active perception exploration. Furthermore, Pad-AI integrates prediction and RL within a unified framework to provide risk-aware learning and safety guarantees. We tested the framework in challenging scenarios under both dynamic and static occlusions, where it demonstrated efficient and general perception-aware exploration compared with other strong baselines in closed-loop evaluations.
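The TL;DR mentions constraining exploration within RSS-based safety boundaries. As a rough illustration only, and assuming RSS here refers to Responsibility-Sensitive Safety, the sketch below computes the standard RSS minimum longitudinal gap and uses it to filter candidate ego speeds. The function names, the default parameter values, and the idea of filtering a list of target speeds are assumptions made for this example; they are not taken from the paper and do not represent Pad-AI's actual safety mechanism.

```python
def rss_min_longitudinal_gap(v_rear, v_front, response_time=1.0,
                             a_max_accel=2.0, b_min_brake=4.0, b_max_brake=8.0):
    """Standard RSS minimum safe gap (m) between a rear and a front vehicle.

    Speeds are in m/s, accelerations in m/s^2, response_time in s.
    All default values are illustrative assumptions.
    """
    # Worst case: the rear vehicle accelerates during its response time ...
    v_rear_after = v_rear + response_time * a_max_accel
    gap = (
        v_rear * response_time
        + 0.5 * a_max_accel * response_time ** 2
        # ... then brakes gently, while the front vehicle brakes as hard as possible.
        + v_rear_after ** 2 / (2.0 * b_min_brake)
        - v_front ** 2 / (2.0 * b_max_brake)
    )
    return max(0.0, gap)


def filter_safe_speeds(candidate_speeds, v_front, current_gap):
    """Hypothetical helper: keep candidate ego speeds whose RSS gap fits in current_gap."""
    return [
        v for v in candidate_speeds
        if rss_min_longitudinal_gap(v, v_front) <= current_gap
    ]


if __name__ == "__main__":
    # Ego and a (possibly predicted) front vehicle both at 10 m/s.
    print(rss_min_longitudinal_gap(10.0, 10.0))
    # Only candidate speeds that respect the available 25 m gap are kept.
    print(filter_safe_speeds([5.0, 10.0, 15.0], v_front=10.0, current_gap=25.0))
```

In an occlusion setting, the front vehicle could be a hypothesized agent emerging from the occluded region, so a filter of this kind would restrict the speeds an RL policy is allowed to explore near the occlusion boundary.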
