Real-time Traffic Accident Anticipation with Feature Reuse

Inpyo Song, Jangwon Lee

TL;DR

This work tackles real-time traffic accident anticipation by addressing the latency of heavy feature-extraction pipelines. It introduces RARE, a lightweight framework that reuses intermediate embeddings from a single pre-trained object detector, and adds an Attention Score Ranking Loss to explicitly push attention toward accident-related objects. The method combines detector-derived object and scene embeddings with a scene-object attention module and a short-term memory for temporal context, optimized with a total loss $L = L_{\mathrm{AdaLEA}} + \gamma L_R$. On DAD and CCD benchmarks, RARE achieves real-time latency around 13.6 ms per frame (73.3 FPS) while maintaining state-of-the-art AP, demonstrating practical viability for safety-critical autonomous driving systems.
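To make the combined objective concrete, here is a minimal sketch, not the paper's implementation. The summary above gives only the sum $L = L_{\mathrm{AdaLEA}} + \gamma L_R$. The pairwise hinge form of the ranking term, the `margin` value, the function names, and the scalar standing in for $L_{\mathrm{AdaLEA}}$ are all assumptions made for illustration.

```python
def attention_ranking_loss(scores, is_accident, margin=0.1):
    """Hypothetical pairwise hinge ranking loss over attention scores.

    Penalizes every (accident-related, other) object pair in which the
    accident-related object's score does not exceed the other's by `margin`.
    The hinge form and the margin are assumptions, not taken from the paper.
    """
    pos = [s for s, a in zip(scores, is_accident) if a]
    neg = [s for s, a in zip(scores, is_accident) if not a]
    if not pos or not neg:
        return 0.0  # no pairs to rank in this frame
    pair_losses = [max(0.0, margin - (p - n)) for p in pos for n in neg]
    return sum(pair_losses) / len(pair_losses)


def total_loss(anticipation_loss, ranking_loss, gamma=1.0):
    """L = L_AdaLEA + gamma * L_R; `anticipation_loss` stands in for L_AdaLEA."""
    return anticipation_loss + gamma * ranking_loss


# Toy frame: object 0 is accident-related but is outranked by objects 1 and 2.
scores = [0.2, 0.5, 0.3]
labels = [True, False, False]
l_r = attention_ranking_loss(scores, labels)
loss = total_loss(anticipation_loss=0.7, ranking_loss=l_r, gamma=0.5)
```

In this toy frame the ranking term is positive because the accident-related object receives the lowest score. It drops to zero once that object leads every other object by at least the margin.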

Abstract

This paper addresses the problem of anticipating traffic accidents, which aims to forecast potential accidents before they happen. Real-time anticipation is crucial for safe autonomous driving, yet most methods rely on computationally heavy modules like optical flow and intermediate feature extractors, making real-world deployment challenging. In this paper, we thus introduce RARE (Real-time Accident anticipation with Reused Embeddings), a lightweight framework that capitalizes on intermediate features from a single pre-trained object detector. By eliminating additional feature-extraction pipelines, RARE significantly reduces latency. Furthermore, we introduce a novel Attention Score Ranking Loss, which prioritizes higher attention on accident-related objects over non-relevant ones. This loss enhances both accuracy and interpretability. RARE demonstrates a 4-8 times speedup over existing approaches on the DAD and CCD benchmarks, achieving a latency of 13.6ms per frame (73.3 FPS) on an RTX 6000. Moreover, despite its reduced complexity, it attains state-of-the-art Average Precision and reliably anticipates imminent collisions in real time. These results highlight RARE's potential for safety-critical applications where timely and explainable anticipation is essential.
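The latency and throughput figures quoted above are two views of the same quantity. The short sketch below converts between them and compares RARE against the 30 FPS real-time threshold named in the Figure 1 caption. The helper names are illustrative, and the calculation assumes frames are processed one at a time, so that FPS is simply the reciprocal of per-frame latency.

```python
def fps_to_latency_ms(fps):
    """Per-frame latency in milliseconds for a given throughput."""
    return 1000.0 / fps


def latency_ms_to_fps(latency_ms):
    """Throughput in frames per second for a given per-frame latency."""
    return 1000.0 / latency_ms


rare_latency_ms = fps_to_latency_ms(73.3)      # about 13.64 ms, reported as 13.6 ms
real_time_budget_ms = fps_to_latency_ms(30.0)  # about 33.33 ms per frame at 30 FPS
headroom = real_time_budget_ms / rare_latency_ms  # about 2.44x under the budget
```

Under this reading, the reported 73.3 FPS leaves a little over twice the time budget that a 30 FPS pipeline would need.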

Paper Structure

This paper contains 18 sections, 8 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Performance comparison of traffic accident anticipation methods. It shows FPS (x-axis) vs. Average Precision (y-axis) on the DAD (a) and CCD (b) datasets. RARE is the only method that achieves real-time performance (73.3 FPS), exceeding the 30 FPS threshold while maintaining the highest AP on both datasets.
  • Figure 2: Overview of our proposed RARE framework. RARE detects objects in each input frame and extracts intermediate features using a pre-trained object detector. Object-specific embeddings $F_o$ are computed via RoI Align on multi-scale features, while temporal scene-level dynamics are encoded by a GRU over the detector’s backbone features. An attention module fuses these scene-level and object-level representations, and the fused features are passed to a classifier that predicts accident risk.
  • Figure 3: Qualitative results of RARE on the DAD dataset. The top row shows detected bounding boxes in green, with the object receiving the highest attention weight in red and the second-highest in orange. The bottom row shows the predicted risk score in blue; the vertical red dotted line marks the time of the accident, and the horizontal black dotted line indicates the decision threshold.
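
The fusion step in Figure 2 and the attention-based highlighting in Figure 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' module. The captions do not specify the attention form, so scaled dot-product attention is assumed, with a scene vector as the query and object embeddings as keys and values. The function names and toy vectors are hypothetical, and the detector, RoI Align, GRU, and any learned projections are omitted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scene_object_attention(scene, objects):
    """Assumed scaled dot-product attention of one scene vector over objects.

    Returns the per-object attention weights and the weighted sum of the
    object embeddings (the fused feature).
    """
    scale = math.sqrt(len(scene))
    logits = [sum(s * o for s, o in zip(scene, obj)) / scale for obj in objects]
    weights = softmax(logits)
    fused = [
        sum(w * obj[d] for w, obj in zip(weights, objects))
        for d in range(len(scene))
    ]
    return weights, fused


def top_k(weights, k=2):
    """Indices of the k objects with the largest attention weights."""
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]


# Toy inputs: one 2-D scene vector and three 2-D object embeddings.
scene = [1.0, 0.0]
objects = [[0.1, 0.9], [2.0, 0.0], [1.0, 1.0]]
weights, fused = scene_object_attention(scene, objects)
highlighted = top_k(weights)  # most-attended object first, then the runner-up
```

The two indices in `highlighted` play the role of the red and orange boxes in Figure 3. In the full model, `fused` would be passed to the risk classifier.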