ICPR 2024 Competition on Rider Intention Prediction
Shankar Gangisetty, Abdul Wasi, Shyam Nandan Rai, C. V. Jawahar, Sajay Raj, Manish Prajapati, Ayesha Choudhary, Aaryadev Chandra, Dev Chandan, Shireen Chand, Suvaditya Mukherjee
TL;DR
The paper introduces the RAAD dataset and the ICPR 2024 Rider Intention Prediction (RIP) competition, which aim to anticipate two-wheeler maneuvers before they are executed, through two tasks: frontal-view RIP and multi-view RIP. It benchmarks a Mamba2 state-space model (SSM), an SVM-based approach with SMOTE, and a CNN-LSTM model, and finds that the Mamba2 SSM achieves the best overall accuracy and F1 on the RAAD benchmark. RAAD comprises 1,000 multi-view video samples covering six maneuvers from three camera views, with embeddings from VGG-16, ResNet-50, and R(2+1)D; the results expose challenges from class imbalance and limited gains from multi-view input. They indicate the promise of high-dimensional temporal-state models for RIP and point to future work on longer-term rider-intention data and on architectures that better exploit multi-view data. These contributions deliver a practical, real-world dataset and competitive baselines to advance proactive rider safety in unstructured traffic contexts.
Abstract
The recent surge in the vehicle market has led to an alarming increase in road accidents. This underscores the critical importance of enhancing road safety measures, particularly for vulnerable road users such as motorcyclists. Hence, we introduce the rider intention prediction (RIP) competition, which aims to address challenges in rider safety by proactively predicting maneuvers before they occur. This capability enables riders to react to potentially incorrect maneuvers flagged by advanced driver assistance systems (ADAS). For the competition, we collect a new dataset, the rider action anticipation dataset (RAAD), which supports two tasks: single-view RIP and multi-view RIP. The dataset covers a spectrum of traffic conditions and challenging navigational maneuvers on roads with varying lighting conditions. We received seventy-five registrations and five team submissions for inference, of which we compared the methods of the top three teams on both RIP tasks: one state-space model (Mamba2) and two learning-based approaches (SVM and CNN-LSTM). The results indicate that the state-space model outperformed the other methods across the entire dataset, providing balanced performance across maneuver classes. The SVM-based RIP method showed the second-best performance when using random sampling and SMOTE. The CNN-LSTM method underperformed, primarily due to class imbalance, and struggled in particular with minority classes. This paper details the proposed RAAD dataset and summarizes the submissions to the RIP 2024 competition.
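For readers unfamiliar with SMOTE, the oversampling idea mentioned above can be sketched in a few lines. This is a minimal illustration of the general technique, not the participating team's implementation: the function name `smote_oversample`, the parameter `k`, and the toy 2-D points are all assumptions made for this example, and a real pipeline would more likely apply a library implementation to the video embeddings.

```python
import math
import random

def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).
    Hypothetical helper for illustration only."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between base and neighbour.
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy 2-D "embeddings" for a rare maneuver class (illustrative values only).
rare_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(rare_class, n_new=6)
```

Because every synthetic point is a convex combination of two existing minority samples, the minority class grows without exact duplicates, which is the property that helps a classifier such as an SVM when some maneuver classes are rare.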
