Uncertainty-Aware Vision-based Risk Object Identification via Conformal Risk Tube Prediction

Kai-Yu Fu, Yi-Ting Chen

Abstract

We study object importance-based vision risk object identification (Vision-ROI), a key capability for hazard detection in intelligent driving systems. Existing approaches make deterministic decisions and ignore uncertainty, which can lead to safety-critical failures. Specifically, in ambiguous scenarios, fixed decision thresholds may cause premature or delayed risk detection and temporally unstable predictions, especially in complex scenes with multiple interacting risks. Despite these challenges, current methods lack a principled framework to model risk uncertainty jointly across space and time. We propose Conformal Risk Tube Prediction, a unified formulation that captures spatiotemporal risk uncertainty, provides coverage guarantees for true risks, and produces calibrated risk scores with uncertainty estimates. To conduct a systematic evaluation, we present a new dataset and metrics probing diverse scenario configurations with multi-risk coupling effects, which are not supported by existing datasets. We systematically analyze factors affecting uncertainty estimation, including scenario variations, per-risk category behavior, and perception error propagation. Our method delivers substantial improvements over prior approaches, enhancing Vision-ROI robustness and downstream performance, such as reducing nuisance braking alerts. For more qualitative results, please visit our project webpage: https://hcis-lab.github.io/CRTP/

Paper Structure (19 sections, 1 equation, 5 figures, 6 tables)

This paper contains 19 sections, 1 equation, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Risk Tube Prediction. Our formulation models risk uncertainty jointly across space and time by representing potential hazards as spatiotemporal risk tubes. In this example, the green-boxed truck (#2) moves forward and may occlude part of the scene, creating the possibility of a hidden object (#3) emerging from the occluded region. Risk tubes illustrate how potential hazards evolve over time, while uncertainty is visualized through semi-transparent shading that gradually becomes opaque as observations reduce ambiguity and confidence increases.
  • Figure 2: The presence of multiple risks reshapes object-ego interactions in complex ways across both space and time.
  • Figure 3: The scenario taxonomy specifies attributes including road topology, risk trigger location, risk category, object type, and behavior. Given a scenario configuration, we script hazard behaviors, control traffic conditions, and further augment the scenario by varying traffic density in CARLA.
  • Figure 4: Overview of the Conformal Risk Tube Prediction Framework. Given front-view images, the model performs spatiotemporal relation modeling and predicts each object’s future risk interval. Then, based on a prediction of an object’s risk category, the corresponding conformal calibrator is applied to calibrate its risk scores over the future interval. The calibrated risk tube provides a more precise temporal bound to fully cover the true risk interval of each hazardous object.
  • Figure 5: Visualization of ROI results before and after calibration on a sampled scenario. All detected risk objects are shown with green bounding boxes, while ground truth risks are in red.
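
The per-category calibration step described in the Figure 4 caption can be illustrated with a minimal split-conformal sketch in Python. This is not the authors' implementation: the function names (`conformal_margin`, `fit_per_category`, `calibrate`), the interval-based nonconformity score, and the symmetric widening of the predicted interval by a single margin are all assumptions made for illustration. The sketch assumes each object has a predicted risk interval `[start, end]`, a ground-truth risk interval on a held-out calibration set, and a risk-category label used to select the calibrator.

```python
import math
from collections import defaultdict

import numpy as np


def conformal_margin(pred, true, alpha=0.1):
    """Split-conformal margin for interval predictions (illustrative only).

    pred, true: array-likes of shape (n, 2) holding [start, end] times.
    Returns the margin q by which predicted intervals are widened.
    """
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    # Nonconformity score: how far the true interval extends beyond the
    # predicted one (negative when it is covered with slack on both ends).
    scores = np.maximum(pred[:, 0] - true[:, 0], true[:, 1] - pred[:, 1])
    n = len(scores)
    # Finite-sample corrected rank ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration samples for the requested level.
        return float("inf")
    return float(np.sort(scores)[k - 1])


def fit_per_category(preds, trues, categories, alpha=0.1):
    """Fit one margin per risk category (hypothetical grouping scheme)."""
    preds = np.asarray(preds, dtype=float)
    trues = np.asarray(trues, dtype=float)
    groups = defaultdict(list)
    for i, cat in enumerate(categories):
        groups[cat].append(i)
    return {
        cat: conformal_margin(preds[idx], trues[idx], alpha)
        for cat, idx in groups.items()
    }


def calibrate(interval, category, margins):
    """Widen a predicted interval using the margin of its category."""
    q = margins[category]
    return (interval[0] - q, interval[1] + q)
```

Under the usual exchangeability assumption between calibration and test samples, widening by the returned margin covers the true interval with probability at least `1 - alpha` within each group. How this maps onto the paper's risk tubes and calibrated risk scores is not specified in the text above, so the sketch should be read only as an instance of the general technique.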