Exploring Driving Behavior for Autonomous Vehicles Based on Gramian Angular Field Vision Transformer
Junwei You, Ying Chen, Zhuoyu Jiang, Zhangchi Liu, Zilin Huang, Yifeng Ding, Bin Ran
TL;DR
The paper addresses driving-behavior classification for autonomous vehicles (AVs) in support of fault diagnosis and algorithm improvement. It proposes GAF-ViT, which converts multivariate driving sequences into multi-channel images using Gramian Angular Field (GAF) representations and classifies them with a Vision Transformer (ViT) augmented by a channel-attention mechanism. The key contributions are (i) visualizing driving-behavior features via GAF, (ii) transforming multivariate sequences into image-like inputs for ViT-based classification, and (iii) demonstrating performance gains on the Waymo trajectory dataset, with an ablation study indicating that each module contributes to the result. The work is relevant to AV safety, since behavior classification can support anomaly detection and adaptive control, and it could be deployed on edge devices or roadside unit (RSU) infrastructure for scalable real-time analysis.
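To make the sequence-to-image step concrete, the following is a minimal sketch of the standard Gramian Angular Field construction applied per feature and stacked into a multi-channel image. It is an illustration under stated assumptions, not the authors' implementation: the function names `gramian_angular_field` and `sequences_to_image`, the choice of the summation variant as the default, and the synthetic speed, acceleration, and yaw-rate features are all hypothetical.

```python
import numpy as np


def gramian_angular_field(series, method="summation"):
    """Return the GAF matrix of a 1-D series (standard definition)."""
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Constant series: map every sample to 0 to avoid dividing by zero.
        scaled = np.zeros_like(x)
    else:
        # Min-max rescale to [-1, 1] so that arccos is defined.
        scaled = (2.0 * x - hi - lo) / (hi - lo)
    # Guard against tiny floating-point overshoot outside [-1, 1].
    scaled = np.clip(scaled, -1.0, 1.0)
    phi = np.arccos(scaled)  # polar-coordinate angle of each sample
    if method == "summation":
        return np.cos(phi[:, None] + phi[None, :])
    if method == "difference":
        return np.sin(phi[:, None] - phi[None, :])
    raise ValueError("method must be 'summation' or 'difference'")


def sequences_to_image(features, method="summation"):
    """Convert a (T, C) multivariate sequence into a (C, T, T) image."""
    features = np.asarray(features, dtype=float)
    channels = [
        gramian_angular_field(features[:, c], method)
        for c in range(features.shape[1])
    ]
    return np.stack(channels, axis=0)


# Hypothetical example: 32 time steps of three synthetic driving features.
t = np.linspace(0.0, 4.0, 32)
speed = 10.0 + 2.0 * np.sin(t)
acceleration = 2.0 * np.cos(t)
yaw_rate = 0.1 * np.sin(2.0 * t)
image = sequences_to_image(np.column_stack([speed, acceleration, yaw_rate]))
print(image.shape)
```

Each channel is a T-by-T matrix whose entry (i, j) encodes the angular relationship between time steps i and j of one feature, which is what lets an image classifier such as a ViT consume the sequence.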
Abstract
Effective classification of autonomous vehicle (AV) driving behavior is critical for diagnosing AV operation faults, improving autonomous driving algorithms, and reducing accident rates. This paper presents the Gramian Angular Field Vision Transformer (GAF-ViT) model, designed to analyze AV driving behavior. The proposed GAF-ViT model consists of three key components: a GAF Transformer Module, a Channel Attention Module, and a Multi-Channel ViT Module. Together, these modules convert representative multivariate behavior sequences into multi-channel images and apply image recognition techniques to classify behavior. A channel attention mechanism is applied to the multi-channel images to weigh the influence of the individual driving-behavior features. Experimental evaluation on trajectories from the Waymo Open Dataset demonstrates that the proposed model achieves state-of-the-art performance. Furthermore, an ablation study supports the contribution of each individual module within the model.
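The abstract does not specify the form of the channel attention mechanism. As one plausible illustration, the sketch below uses a squeeze-and-excitation style gate, which is an assumption swapped in here rather than the paper's confirmed design. The function name `channel_attention`, the hidden width, and the randomly initialized weights `w1` and `w2` (which would be learned in practice) are hypothetical.

```python
import numpy as np


def channel_attention(image, w1, w2):
    """Reweight the channels of a (C, H, W) image with an SE-style gate.

    w1 has shape (C, hidden) and w2 has shape (hidden, C).
    Returns the reweighted image and the per-channel weights.
    """
    # Squeeze: summarize each channel by its global average.
    squeezed = image.mean(axis=(1, 2))              # shape (C,)
    # Excite: a small two-layer network with a ReLU in between.
    hidden = np.maximum(squeezed @ w1, 0.0)         # shape (hidden,)
    weights = 1.0 / (1.0 + np.exp(-(hidden @ w2)))  # sigmoid, shape (C,)
    # Scale every channel by its attention weight.
    return image * weights[:, None, None], weights


# Hypothetical input: a 3-channel 16x16 image standing in for stacked GAF maps.
rng = np.random.default_rng(0)
image = rng.uniform(-1.0, 1.0, size=(3, 16, 16))
w1 = rng.normal(size=(3, 4))  # untrained placeholder weights
w2 = rng.normal(size=(4, 3))
attended, weights = channel_attention(image, w1, w2)
print(attended.shape, weights.shape)
```

Because each channel corresponds to one driving-behavior feature, the per-channel weights give a direct reading of how strongly each feature is emphasized before classification.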
