Hybrid-Prediction Integrated Planning for Autonomous Driving

Haochen Liu, Zhiyu Huang, Wenhui Huang, Haohan Yang, Xiaoyu Mo, Chen Lv

TL;DR

The paper tackles the fragmentation between prediction and planning in autonomous driving by proposing Hybrid-Prediction integrated Planning (HPP), a modular co-design framework that fuses planning integrated with motion prediction (IPP) and planning integrated with occupancy prediction (IOP) through three components: MS-OccFormer for marginal-conditioned occupancy prediction, GTFormer for game-theoretic reasoning among agents, and an interactive Ego Planner. By encoding bird's-eye-view (BEV) scene context, aligning multi-scale occupancy with agent-wise motion, and iteratively reasoning across joint and marginal predictions, HPP achieves state-of-the-art results on nuScenes and strong long-horizon performance on the Waymo Open Motion Dataset (WOMD) and CARLA, outperforming both traditional and end-to-end baselines. Key contributions include the marginal-conditioned occupancy formulation, multi-scale prediction-wise integration, level-K game-theoretic Transformer reasoning, and a differentiable optimization pipeline that jointly refines planning with hybrid predictions. The approach demonstrates improved accuracy, safety, and social coherence in end-to-end autonomous driving systems (ADS), highlighting the practical value of modular co-design for robust autonomous driving.

Abstract

Autonomous driving systems must understand and predict the surrounding environment to make informed decisions in complex scenarios. Recent advances in learning-based systems have highlighted the importance of integrating prediction and planning modules. However, this integration raises three major challenges: the inherent trade-offs of relying on a single prediction type, consistency between prediction patterns, and social coherence in prediction and planning. To address these challenges, we introduce a hybrid-prediction integrated planning (HPP) system built on three novel modules. First, we introduce marginal-conditioned occupancy prediction to align joint occupancy with agent-wise perceptions. Our proposed MS-OccFormer module achieves multi-stage alignment of occupancy forecasting with consistent awareness from agent-wise motion predictions. Second, we propose a game-theoretic motion predictor, GTFormer, to model the interactive future among individual agents with their joint predictive awareness. Third, hybrid prediction patterns are concurrently integrated with the Ego Planner and optimized by prediction guidance. HPP achieves state-of-the-art performance on the nuScenes dataset, demonstrating superior accuracy and consistency for end-to-end paradigms in prediction and planning. Moreover, we test the long-term open-loop and closed-loop performance of HPP on the Waymo Open Motion Dataset and the CARLA benchmark, where it surpasses other integrated prediction and planning pipelines with enhanced accuracy and compatibility.
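
To make "consistency between prediction patterns" concrete, the toy sketch below rasterizes agent-wise (marginal) trajectories onto a BEV grid and measures their overlap with a joint occupancy grid using intersection-over-union. This is only an illustration under stated assumptions: the functions `rasterize` and `iou`, the grid size, the cell size, and the footprint radius are all hypothetical choices, and IoU is a stand-in measure, not the alignment mechanism or metric used by MS-OccFormer.

```python
def rasterize(trajectories, grid_size, cell, radius):
    """Mark BEV cells whose centre lies within `radius` of any waypoint.

    The grid is centred on the origin. Returns a grid_size x grid_size
    list of 0/1 values (rows index y, columns index x).
    """
    half = grid_size * cell / 2.0
    grid = [[0] * grid_size for _ in range(grid_size)]
    for traj in trajectories:
        for x, y in traj:
            for r in range(grid_size):
                for c in range(grid_size):
                    cx = -half + (c + 0.5) * cell  # cell-centre x
                    cy = -half + (r + 0.5) * cell  # cell-centre y
                    if (cx - x) ** 2 + (cy - y) ** 2 <= radius ** 2:
                        grid[r][c] = 1
    return grid


def iou(a, b):
    """Intersection-over-union of two binary grids of equal shape."""
    pairs = [(x, y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    inter = sum(1 for x, y in pairs if x and y)
    union = sum(1 for x, y in pairs if x or y)
    return inter / union if union else 1.0


# Hypothetical data: one marginal trajectory and a joint occupancy that
# only partly agrees with it (both expressed through the same rasterizer).
marginal_grid = rasterize([[(0.5, 0.5), (1.5, 0.5)]], grid_size=8, cell=1.0, radius=0.6)
joint_grid = rasterize([[(1.5, 0.5), (2.5, 0.5)]], grid_size=8, cell=1.0, radius=0.6)

consistency = iou(marginal_grid, joint_grid)  # one shared cell out of three occupied
```

A score of 1.0 would mean the two prediction patterns occupy exactly the same cells, while lower values indicate disagreement of the kind a marginal-conditioned occupancy formulation aims to reduce.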

Paper Structure

This paper contains 41 sections, 16 equations, 8 figures, 12 tables.

Figures (8)

  • Figure 1: Generic learning-based pipelines in autonomous driving. a) Integrated pipelines learn planning jointly with motion prediction (IPP) or occupancy prediction (IOP) with coupled networks. b) End-to-end pipelines directly map from raw sensor inputs to planning. c) Our proposed HPP establishes a coupled planning system with hybrid-prediction integration and optimization.
  • Figure 2: Systematic overview of the proposed Hybrid-Prediction integrated Planning (HPP) framework. HPP is built on query-based co-design optimization of interactive planning with hybrid prediction integration (IPP and IOP), informed by BEV perception. Given the encoded scene context $Q_{Map}, Q_B, Q_A$, HPP co-designs prediction and planning in three ways. Joint occupancy prediction $\hat{\mathbf{O}}$ is iteratively refined in MS-OccFormer and kept mutually consistent with the marginal motion prediction $\hat{\mathbf{Y}}$ from GTFormer. GTFormer performs interactive reasoning between marginal prediction and planning. The reasoned outcomes and ego features are then used to query the hybrid prediction-aware plan $\tau$ in the Ego Planner. Finally, optimization refines the plan into $\tau^*$ under hybrid-prediction guidance.
  • Figure 3: Generic learning framework in MS-OccFormer: a) A single block of multi-scale marginal-conditioned occupancy predictor. Joint occupancy $\hat{\mathbf{O}}$ is consistently integrated with marginal prediction features $\mathbf{H}^A_{traj}$ through global interactions and local refinements, and guided by iteratively updated learnable attention mask $\mathbf{M}$; b) A Swin-T decoder for local interactions through shifted-window cross attention; c) Agent-wise fusion for marginal prediction features.
  • Figure 4: a) A single-step reasoning layer of GTFormer. Level-$K$ reasoning hierarchically queries interactive behaviors for all agents $Q^A_M$ in prediction and planning, while also attending to the scene context $Q_A, Q_{Map}$ and the joint predictive BEV features $\mathbf{H}_{occ}$; b) Occupancy fusion for joint prediction features; c) Hybrid-prediction-aware Ego Planner conditioned on plan context.
  • Figure 5: Open-loop predictions and planning results for IPP baselines in WOMD. HPP presents the best short-term performance with lower variance.
  • ...and 3 more figures
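
The level-$K$ reasoning named in the Figure 4 caption follows a general game-theoretic idea: a level-0 agent acts without modeling others, and a level-$k$ agent best-responds to the level-$(k-1)$ behavior of everyone else. The sketch below illustrates that idea on discrete trajectory candidates. It is a hand-written toy, not GTFormer, which realizes the reasoning with learned Transformer layers; the function names, the cost terms, `safe_gap`, `penalty`, and the two-agent crossing scenario are all assumptions made for illustration.

```python
from math import hypot


def min_gap(traj_a, traj_b):
    """Smallest same-timestep distance between two trajectories."""
    return min(hypot(ax - bx, ay - by)
               for (ax, ay), (bx, by) in zip(traj_a, traj_b))


def level_k_choices(candidates, base_cost, k_max, safe_gap=1.5, penalty=10.0):
    """Toy level-k reasoning over discrete trajectory candidates.

    candidates[i][m] is the m-th candidate trajectory of agent i.
    base_cost[i][m] is that candidate's cost ignoring other agents.
    Returns a list of length k_max + 1; entry k holds the candidate
    index chosen by each agent at reasoning level k.
    """
    n = len(candidates)
    # Level 0: every agent ignores the others.
    level0 = [min(range(len(candidates[i])), key=lambda m: base_cost[i][m])
              for i in range(n)]
    history = [level0]
    for _ in range(k_max):
        prev = history[-1]
        current = []
        for i in range(n):
            def cost(m, i=i):
                # Own cost plus a penalty for coming too close to the
                # trajectory each other agent chose one level below.
                total = base_cost[i][m]
                for j in range(n):
                    if j != i and min_gap(candidates[i][m],
                                          candidates[j][prev[j]]) < safe_gap:
                        total += penalty
                return total
            current.append(min(range(len(candidates[i])), key=cost))
        history.append(current)
    return history


# Hypothetical crossing: agent 0 drives east, agent 1 drives north.
# Candidate 0 is "go", candidate 1 is "yield" (slow down before the crossing).
candidates = [
    [[(-4, 0), (-2, 0), (0, 0), (2, 0)], [(-4, 0), (-3, 0), (-2, 0), (-1, 0)]],
    [[(0, -4), (0, -2), (0, 0), (0, 2)], [(0, -4), (0, -3), (0, -2), (0, -1)]],
]
base_cost = [[0.0, 1.0], [0.0, 20.0]]  # yielding is cheap for agent 0, costly for agent 1

history = level_k_choices(candidates, base_cost, k_max=2)
# history == [[0, 0], [1, 0], [1, 0]]
```

At level 0 both agents choose "go" and would collide. From level 1 onward agent 0 yields while agent 1 proceeds, and the choices stop changing, which is the kind of socially coherent outcome that hierarchical reasoning is meant to produce.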