Accident Impact Prediction based on a deep convolutional and recurrent neural network model
Pouyan Sajadi, Mahya Qorbani, Sobhan Moosavi, Erfan Hassannayebi
TL;DR
This work tackles real-time forecasting of post-accident impact by introducing a two-stage LSTM-CNN cascade that first detects potential accidents and then predicts their impact using a novel gamma metric derived from delay signals. It draws on four publicly accessible Los Angeles County datasets (accidents, congestion, points of interest (POI), and weather) and a preprocessing and augmentation pipeline to build a spatiotemporal input representation $l(s,t) \in \mathbb{R}^{26}$. The key contribution is the gamma-based labeling combined with a cascaded architecture that yields high precision for non-accident intervals and strong recall for high-impact events, outperforming baseline models including random forest (RF), gradient boosting classifier (GBC), CNN, and LSTM. The framework is designed for real-time deployment and generalizes beyond small regions by encoding region-specific embeddings and by handling class imbalance through undersampling and class weighting. The approach has practical implications for traffic management and safety interventions in urban environments, with clear paths for future work such as broader regional deployment and richer data sources.
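The summary above names undersampling and class weighting as the tools for handling class imbalance but does not give the exact scheme. The following is a minimal sketch under stated assumptions: the inverse-frequency weighting formula, the function names `balanced_class_weights` and `undersample_indices`, and the `keep_fraction` parameter are all illustrative choices, not the authors' implementation.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency class weights: n_samples / (n_classes * class_count).

    Assumption: the source does not state the weighting formula; this is a
    common heuristic in which rarer classes receive larger weights.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def undersample_indices(labels, majority, keep_fraction, seed=0):
    """Return indices to keep after randomly dropping majority-class samples.

    Every non-majority sample is kept; each majority sample is kept with
    probability keep_fraction (a hypothetical parameter).
    """
    rng = random.Random(seed)
    return [
        i
        for i, y in enumerate(labels)
        if y != majority or rng.random() < keep_fraction
    ]


# Toy labels: 0 = no accident (majority class), 1 = accident (minority class).
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)
kept = undersample_indices(labels, majority=0, keep_fraction=0.5)
```

In practice the two steps would typically be combined: undersample the non-accident intervals first, then compute class weights on the reduced training set.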
Abstract
Traffic accidents pose a significant threat to public safety, resulting in numerous fatalities, injuries, and a substantial economic burden each year. The development of predictive models capable of real-time forecasting of post-accident impact using readily available data can play a crucial role in preventing adverse outcomes and enhancing overall safety. However, existing accident prediction models face two main challenges: first, reliance on either costly or non-real-time data, and second, the absence of a comprehensive metric to measure post-accident impact accurately. To address these limitations, this study proposes a deep neural network model known as the cascade model. It leverages readily available real-world data from Los Angeles County to predict post-accident impacts. The model consists of two components: Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). The LSTM model captures temporal patterns, while the CNN extracts patterns from the sparse accident dataset. Furthermore, an external traffic congestion dataset is incorporated to derive a new feature called the "accident impact" factor, which quantifies the influence of an accident on surrounding traffic flow. Extensive experiments were conducted to demonstrate the effectiveness of the proposed hybrid machine learning method in predicting the post-accident impact compared to state-of-the-art baselines. The results reveal a higher precision in predicting minimal impacts (i.e., cases with no reported accidents) and a higher recall in predicting more significant impacts (i.e., cases with reported accidents).
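The two-stage control flow of a cascade of this kind can be sketched as follows. This is only an illustration of the inference logic, assuming a first stage that outputs an accident probability and a second stage that outputs an impact class; the names `cascade_predict`, `detector`, `impact_model`, the `threshold` value, and the convention that class 0 means "no accident" are hypothetical and not taken from the paper.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def cascade_predict(
    features: Features,
    detector: Callable[[Features], float],
    impact_model: Callable[[Features], int],
    threshold: float = 0.5,
) -> int:
    """Two-stage inference for one (region, time interval) feature vector.

    Stage 1: detector returns an estimated probability of an accident.
    Stage 2: impact_model runs only when stage 1 flags an accident and
    returns an impact class. Class 0 stands for "no accident" (assumption).
    """
    p_accident = detector(features)
    if p_accident < threshold:
        return 0  # stage 1 says no accident; stage 2 is skipped
    return impact_model(features)


# Stand-in models for demonstration only; real models would be trained networks.
quiet_interval = cascade_predict([0.0] * 26, lambda x: 0.1, lambda x: 2)
busy_interval = cascade_predict([0.0] * 26, lambda x: 0.9, lambda x: 2)
```

Gating the second stage on the first keeps the impact predictor from being trained and evaluated mostly on non-accident intervals, which is one plausible reason to prefer a cascade over a single multi-class model on sparse accident data.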
