Integrated Intention Prediction and Decision-Making with Spectrum Attention Net and Proximal Policy Optimization

Xiao Zhou; Chengzhen Meng; Wenru Liu; Zengqi Peng; Ming Liu; Jun Ma

TL;DR

A novel integrated intention prediction and decision-making approach that explicitly models the coupling between the two modules, remains computationally efficient, and outperforms several deep reinforcement learning baselines in success rate, efficiency, and safety in driving tasks.

Abstract

For autonomous driving in highly dynamic environments, the autonomous vehicle is expected to predict the future behaviors of surrounding vehicles (SVs) and make safe and effective decisions. However, modeling the inherent coupling between the prediction and decision-making modules has been a long-standing challenge, especially when computational efficiency must be maintained. To tackle these problems, we propose a novel integrated intention prediction and decision-making approach that explicitly models this coupling and remains computationally efficient. Specifically, a spectrum attention net is designed to predict the intentions of SVs by capturing the trend of each frequency component over time and the interrelations among components. The intention prediction module runs fast because the predicted intentions are not decoded into trajectories at execution time. Furthermore, the proximal policy optimization (PPO) algorithm is employed to address the non-stationarity problem in the framework: the clipping mechanism in its objective function keeps each policy update modest. Building on these developments, the intention prediction and decision-making modules are integrated through joint learning. Experiments are conducted in representative traffic scenarios, and the results show that the proposed integrated framework outperforms several deep reinforcement learning (DRL) baselines in success rate, efficiency, and safety in driving tasks.
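The clipping mechanism mentioned in the abstract can be illustrated with the standard PPO clipped surrogate objective. The following is a minimal sketch in plain Python, not the authors' implementation: the function name `ppo_clipped_objective`, the argument names, and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range; 0.2 is an assumed, commonly used default.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Because the smaller of the clipped and unclipped terms is taken, a sample whose probability ratio has already moved outside the clipping range in the direction favored by its advantage contributes a constant, so it no longer pushes the policy further in that direction. This is what keeps each update modest.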

Paper Structure (13 sections, 9 equations, 3 figures, 2 tables, 1 algorithm)

INTRODUCTION
Preliminaries
Markov Decision Process
Short-Time Fourier Transform
Problem Statement
METHODOLOGY
Intention Prediction Module with Spectrum Attention Net
Decision-Making Module with PPO
Joint Learning Process
EXPERIMENTS
Experimental Setup
Performance Demonstration
CONCLUSIONS

Figures (3)

Figure 1: Overall architecture of the proposed integrated intention prediction and decision-making framework, exemplified through a four-way intersection scenario. For each observable SV $i$, its segmented historical trajectory is first transformed into a spectrogram by STFT, and the spectrum attention net then predicts the SV's intention from this spectrogram. The decision-making module receives the predicted intentions of the SVs and makes a decision based on these intentions and direct observation. Note that the trajectory decoder is executed only during joint training, which improves the real-time capability of the integrated framework.
Figure 2: Learning curves of different methods in four representative scenarios. The training curves are smoothed by the Savitzky-Golay filter.
Figure 3: Illustrations of the AV's behavior attained by the proposed integrated intention prediction and decision-making framework in four scenarios. Four representative snapshots from the performance evaluation are presented for each scenario: Straight Road ((a)-(d)), Intersection-v0 ((e)-(h)), Intersection-v1 ((i)-(l)), and Roundabout ((m)-(p)). The green car and blue cars represent the AV and SVs under normal driving conditions, respectively. The color of the trajectory reflects the speed of the vehicle, with the transition from purple to yellow indicating an increase in vehicle speed.
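The trajectory-to-spectrogram step described in the Figure 1 caption can be sketched as a short-time Fourier transform over one coordinate of a trajectory segment. This is only an illustrative sketch: the function name `stft_magnitude`, the Hann window, and the frame and hop lengths are assumptions, and the paper's actual STFT settings are not given in this summary.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Magnitude spectrogram of a 1-D sequence via a naive STFT.

    Returns a list of frames; each frame holds the magnitudes of the
    non-negative frequency bins 0 .. frame_len // 2.
    The Hann window is an assumed choice for this sketch.
    """
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    spectrogram = []
    # Slide a window of frame_len samples over the signal in steps of hop.
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        for k in range(n_bins):
            # Direct DFT of the windowed frame at frequency bin k.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            magnitudes.append(abs(coeff))
        spectrogram.append(magnitudes)
    return spectrogram


# Hypothetical input: one coordinate of a trajectory oscillating at bin 2.
xs = [math.cos(2.0 * math.pi * 2 * n / 8) for n in range(16)]
spec = stft_magnitude(xs, frame_len=8, hop=4)
```

Each row of `spec` describes the frequency content of one time window, so reading down a column shows how a single frequency component evolves over time, which is the kind of trend the spectrum attention net is described as capturing.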

Theorems & Definitions (2)

Remark 1
Remark 2
