RiskProp: Collision-Anchored Self-Supervised Risk Propagation for Early Accident Anticipation

Yiyang Zou, Tianhao Zhao, Peilun Xiao, Hongyu Jin, Longyu Qi, Yuxuan Li, Liyin Liang, Yifeng Qian, Chunbo Lai, Yutian Lin, Zhihui Li, Yu Wu

Abstract

Accident anticipation aims to predict impending collisions from dashcam videos and trigger early alerts. Existing methods rely on binary supervision with manually annotated "anomaly onset" frames, which are subjective and inconsistent, leading to inaccurate risk estimation. In contrast, we propose RiskProp, a novel collision-anchored self-supervised risk propagation paradigm for early accident anticipation, which removes the need for anomaly onset annotations and leverages only the reliably annotated collision frame. RiskProp models temporal risk evolution through two observation-driven losses: first, since future frames contain more definitive evidence of an impending accident, we introduce a future-frame regularization loss that uses the model's next-frame prediction as a soft target to supervise the current frame, enabling backward propagation of risk signals; second, inspired by the empirical trend of rising risk before accidents, we design an adaptive monotonic constraint to encourage a non-decreasing progression over time. Experiments on CAP and Nexar demonstrate that RiskProp achieves state-of-the-art performance and produces smoother, more discriminative risk curves, improving both early anticipation and interpretability.
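
The backward propagation of risk described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact loss form, so the soft-target binary cross-entropy used here, and the helper names `soft_bce`, `future_frame_loss`, and `collision_anchor_loss`, are assumptions chosen for illustration. Per-frame risks are plain floats in (0, 1).

```python
import math


def soft_bce(pred, target, eps=1e-7):
    """Binary cross-entropy between a prediction and a soft target in [0, 1]."""
    pred = min(max(pred, eps), 1.0 - eps)  # clip to keep the logs finite
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))


def future_frame_loss(risks):
    """Mean loss where frame t is supervised by the prediction at frame t + 1.

    Assumption: in an autograd framework the target risks[t + 1] would be
    detached, so gradients only flow into the current-frame prediction.
    """
    if len(risks) < 2:
        raise ValueError("need at least two frames")
    pairs = list(zip(risks[:-1], risks[1:]))
    return sum(soft_bce(cur, nxt) for cur, nxt in pairs) / len(pairs)


def collision_anchor_loss(risks, collision_index):
    """Hard-label BCE applied only at the annotated collision frame (label 1)."""
    return soft_bce(risks[collision_index], 1.0)


# Hypothetical risk curve for a short accident clip; the last frame is the collision.
risks = [0.10, 0.15, 0.40, 0.90]
total = future_frame_loss(risks) + collision_anchor_loss(risks, len(risks) - 1)
```

Under this assumed form, the collision frame is the only place a hard label enters; every earlier frame is pulled toward its successor's prediction, which is how a high risk at the collision can spread to preceding frames over training.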

Paper Structure

This paper contains 16 sections, 9 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Forms of supervision in previous work and in ours. (a) Most previous methods treat all frames in accident-free videos as negative samples with label 0, and treat the frames of accident videos from the annotated anomaly onset frame to the collision frame as positive samples with label 1. (b) Our method treats the model’s prediction for the next frame as a supervision signal for the current frame, which enables the risk values to propagate gradually backward from the collision frame.
  • Figure 2: Overview of our RiskProp framework. The encoder-only model takes a snippet of consecutive frames and predicts the current frame's risk score. To train such a model without "anomaly onset" labels, using only the objective collision frame label, two losses are proposed: the Future-Frame Regularization Loss uses the detached prediction for the next frame as a self-supervised target, propagating high-risk signals backward in time, while the Adaptive Monotonic Constraint Loss imposes a monotonicity constraint to ensure a non-decreasing overall risk trend. In addition, a Binary Cross-Entropy loss provides explicit supervision only at the collision frame. Together these yield stable, physically plausible risk curves without manual onset annotations.
  • Figure 3: Illustration of the sampling strategy for the adaptive monotonic constraint loss. For a randomly selected $d \in [d_{min}, d_{max}]$, a starting frame $t_1$ is sampled from $[0, T(1 - d)]$, and $t_2 = t_1 + dT$. Varying $d$ exposes the constraint to frame pairs at a range of temporal distances.
  • Figure 4: Frame- and Dataset-Level Risk Prediction. Left: On a representative accident video, supervised baselines produce early false peaks before clear risk emerges. Our RiskProp model prediction remains low during safe periods and rises sharply only when discriminative cues appear, yielding a temporally coherent risk curve. Right: Dataset-wide average risk curves on CAP and Nexar, aligned to accident timestamps. RiskProp suppresses early false positives and delivers a steeper, better-calibrated rise near the event, enabling earlier, more reliable warnings.
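
The pair sampling described in the Figure 3 caption can be sketched as follows. This is a minimal illustration, not the paper's code: the function names, the rounding of $dT$ to a whole number of frames, the one-frame shift of the upper bound that keeps $t_2$ a valid 0-based index, and the plain hinge penalty (used here in place of the adaptive constraint, whose exact form the captions do not give) are all assumptions.

```python
import random


def sample_frame_pair(num_frames, d_min, d_max, rng=random):
    """Sample (t1, t2) with t2 = t1 + d*T for a random d in [d_min, d_max].

    Assumptions: d*T is rounded to whole frames, and t1 is drawn from
    [0, T - 1 - gap] so that t2 stays a valid 0-based frame index.
    """
    d = rng.uniform(d_min, d_max)
    gap = max(1, round(d * num_frames))   # temporal distance dT in frames
    gap = min(gap, num_frames - 1)        # keep at least one valid start frame
    t1 = rng.randint(0, num_frames - 1 - gap)
    return t1, t1 + gap


def monotonic_penalty(risks, t1, t2, margin=0.0):
    """Hinge penalty that is positive only when risk drops from t1 to t2.

    Assumption: a simple hinge stands in for the adaptive constraint.
    """
    return max(0.0, risks[t1] - risks[t2] + margin)


# Hypothetical usage on a 50-frame clip with a linearly rising risk curve.
rng = random.Random(0)
risks = [i / 49 for i in range(50)]
t1, t2 = sample_frame_pair(len(risks), 0.1, 0.5, rng)
penalty = monotonic_penalty(risks, t1, t2)  # zero for a non-decreasing curve
```

Drawing $d$ afresh for each pair means the penalty compares both nearby and distant frames, so it constrains the overall trend of the curve and not only adjacent-frame differences.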