How Safe Will I Be Given What I Saw? Calibrated Prediction of Safety Chances for Image-Controlled Autonomy

Zhenjiang Mao, Mrinall Eashaan Umasudhan, Ivan Ruchkin

TL;DR

The paper tackles calibrated safety prediction for image-controlled autonomous systems under partial observability and distribution shift by introducing a modular framework that combines world-model latent forecasting, monolithic and composite predictors, unsupervised domain adaptation (MEMO), and conformal calibration. It demonstrates that decomposing the problem into latent representation, trajectory forecasting, and safety evaluation provides strong long-horizon performance, particularly when equipped with attention-based forecasters and latent evaluators. A key contribution is the development of post-hoc conformal calibration and adaptive binning to produce reliable probability intervals for safety predictions, with theoretical guarantees. Experimentally, the approach is validated on racing-car, cart-pole, and Donkey Car benchmarks, showing improved F1, reduced false positives, robust performance under distribution shift, and reliable calibration bounds across horizons.

Abstract

Autonomous robots that rely on deep neural network controllers pose critical challenges for safety prediction, especially under partial observability and distribution shift. Traditional model-based verification techniques are limited in scalability and require access to low-dimensional state models, while model-free methods often lack reliability guarantees. This paper addresses these limitations by introducing a framework for calibrated safety prediction in end-to-end vision-controlled systems, where neither the state-transition model nor the observation model is accessible. Building on the foundation of world models, we leverage variational autoencoders and recurrent predictors to forecast future latent trajectories from raw image sequences and estimate the probability of satisfying safety properties. We distinguish between monolithic and composite prediction pipelines and introduce a calibration mechanism to quantify prediction confidence. In long-horizon predictions from high-dimensional observations, the forecasted inputs to the safety evaluator can deviate significantly from the training distribution due to compounding prediction errors and changing environmental conditions, leading to miscalibrated risk estimates. To address this, we incorporate unsupervised domain adaptation to ensure robustness of safety evaluation under distribution shift in predictions without requiring manual labels. Our formulation provides theoretical calibration guarantees and supports practical evaluation across long prediction horizons. Experimental results on three benchmarks show that our UDA-equipped evaluators maintain high accuracy and substantially lower false positive rates under distribution shift. Similarly, world model-based composite predictors outperform their monolithic counterparts on long-horizon tasks, and our conformal calibration provides reliable statistical bounds.
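The composite decomposition described in the abstract (encode the image, forecast the latent state, then evaluate safety) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the linear encoder `W_enc`, the linear latent dynamics `A`, and the logistic evaluator `w_eval` are hypothetical stand-ins for the paper's variational autoencoder, recurrent forecaster, and learned safety evaluator, and none of the function names come from the paper.

```python
import numpy as np

def encode(image, W_enc):
    """Toy stand-in for a VAE encoder: flatten the image and project it to a latent vector."""
    return W_enc @ image.ravel()

def forecast(latent, A, horizon):
    """Toy stand-in for a recurrent forecaster: roll a linear latent model forward `horizon` steps."""
    for _ in range(horizon):
        latent = A @ latent
    return latent

def evaluate(latent, w_eval):
    """Toy stand-in for a latent safety evaluator: a logistic score in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-(w_eval @ latent)))

def composite_latent_predictor(image, W_enc, A, w_eval, horizon):
    """Compose the three stages: encode, forecast in latent space, then evaluate safety."""
    return evaluate(forecast(encode(image, W_enc), A, horizon), w_eval)
```

A monolithic predictor would instead map the image (or its latent code) to a safety score in a single learned step, with no explicit forecast in between. The score returned here is uncalibrated; the paper applies calibration afterwards.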

Paper Structure

This paper contains 29 sections, 1 theorem, 21 equations, 6 figures, 4 tables, 1 algorithm.

Key Result

Theorem 1

Given a dataset bin $B = \{b_{k}\}_{k=1}^{K}$ of i.i.d. observation–state pairs $b_{k} = (y_{k}, x_{k})$, we obtain a collection of datasets $\{B_{j}\}_{j=1}^{M}$ by drawing $M$ datasets of $N$ i.i.d. samples from $B$, so that the datasets $B_{j}$ are drawn i.i.d. from a dataset distribution [...] leads to prediction intervals with guaranteed containment: [...] where $q(B_{M+1})$ is the mean safety c...
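The theorem statement is truncated in this summary, so its exact form cannot be reproduced here. As a rough illustration only, the sketch below follows the standard split-conformal recipe, which may differ from the paper's construction: for each of the $M$ resampled datasets, take the absolute gap between the predictor's mean confidence and the empirical safety rate as a nonconformity score, then use the $\lceil (M+1)(1-\alpha) \rceil$-th smallest score as the interval half-width. The function names and the choice of score are assumptions.

```python
import math
import numpy as np

def conformal_halfwidth(mean_conf, safe_rate, alpha):
    """Split-conformal half-width from M calibration datasets.

    mean_conf[j] is the mean predicted safety confidence on dataset j (assumed input);
    safe_rate[j] is the empirical fraction of safe outcomes on dataset j (assumed input).
    """
    scores = np.sort(np.abs(np.asarray(mean_conf, dtype=float)
                            - np.asarray(safe_rate, dtype=float)))
    m = scores.size
    k = math.ceil((m + 1) * (1 - alpha))  # rank of the conformal quantile
    if k > m:
        return float("inf")  # too few calibration datasets for this alpha
    return float(scores[k - 1])

def conformal_interval(new_mean_conf, halfwidth):
    """Interval around a new mean confidence, clipped to the valid range [0, 1]."""
    return max(0.0, new_mean_conf - halfwidth), min(1.0, new_mean_conf + halfwidth)
```

Under exchangeability of the calibration datasets and the new one, an interval built this way contains the new dataset's safety rate with probability at least $1-\alpha$. When $M$ is too small for the requested $\alpha$, the half-width is infinite and the clipped interval becomes the trivial $[0, 1]$.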

Figures (6)

  • Figure 1: Safety prediction pipelines for image-controlled autonomous systems. The blue arrows represent the underlying dynamics loop, consisting of the unknown dynamical model, unknown observation model, and known controller. The orange arrows indicate the learned world model loop, where future observations are simulated based on past observations and actions. The green path illustrates the monolithic prediction pipeline, which directly predicts safety outcomes (safe/unsafe with probability $p$) from observations. The purple path shows the composite prediction pipeline, where a forecaster predicts future observations that are passed to a safety evaluator.
  • Figure 2: Comparison of safety prediction pipelines and their associated world model structures (gray boxes). Top: The monolithic predictor maps image sequences directly to safety probabilities $P(\varphi(x))$, followed by a calibrator for improved confidence. Middle: The latent monolithic predictor encodes images into latent states and predicts safety from these, with calibration applied afterward. Bottom: The composite latent predictor encodes inputs, forecasts future latent states, and uses an evaluator with UDA to estimate safety. All pipelines predict the probability of satisfying the safety property $\varphi(x)$.
  • Figure 3: Observations from our three case studies (top to bottom: cart pole, racing car, and donkey car). The columns (left to right) show a safe observation from $\mathbf{Y}$, an unsafe observation from $\mathbf{Y}$, a safe observation from $d(f_{l}(e(\mathbf{Y})))$, and an unsafe observation from $f_{g}(\mathbf{Y})$ under distribution shift.
  • Figure 4: F1 score of safety label predictors over varied horizons. Top to bottom: (1) cart pole; (2) racing car; (3) donkey car. Left pairs compare monolithic (mon.) models with composite (comp.) models, while right pairs compare attention-based models with non-attention-based models.
  • Figure 5: Calibration of a monolithic CNN predictor for the racing car with horizon $k=100$. Left: uncalibrated, right: calibrated via isotonic regression and conformal bounds for $\alpha=0.05$.
  • ...and 1 more figure
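The Figure 5 caption mentions calibration via isotonic regression. A minimal sketch of that idea follows, using a generic pool-adjacent-violators fit in NumPy. This is not the authors' implementation; the function names are hypothetical, and distinct raw scores are assumed so that the interpolation is well defined.

```python
import numpy as np

def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: return sorted scores and nondecreasing fitted values."""
    order = np.argsort(scores)
    x = np.asarray(scores, dtype=float)[order]
    y = np.asarray(labels, dtype=float)[order]
    # Each block stores [sum of labels, count]; merge neighbours while their means decrease.
    blocks = []
    for value in y:
        blocks.append([value, 1])
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = np.concatenate([np.full(n, s / n) for s, n in blocks])
    return x, fitted

def calibrate(new_scores, x, fitted):
    """Map raw scores to calibrated probabilities by interpolating the fitted values."""
    return np.interp(new_scores, x, fitted)
```

The fit is performed on held-out pairs of raw confidence and observed safety label (1 for safe, 0 for unsafe), and `calibrate` is then applied to the raw confidences of new predictions.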

Theorems & Definitions (10)

  • Definition 1: Robotic system
  • Definition 2: Safety label predictor
  • Definition 3: Safety chance predictor
  • Definition 4: Safety chance interval predictor
  • Definition 5: Observation-action dataset
  • Definition 6: Monolithic latent predictor
  • Definition 7: Composite image predictor
  • Definition 8: Composite latent predictor
  • Definition 9: Adaptive binning
  • Theorem 1: Conformal Bounds on Confidence Scores (adaptation of Theorem 2.1 in Lei et al., 2018)
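Definition 9 is listed above without its content. One common reading of adaptive binning is equal-mass (quantile) binning, where bin edges follow the data so that each bin holds roughly the same number of samples. The sketch below assumes that reading, which may not match the paper's definition; the function names are hypothetical.

```python
import numpy as np

def equal_mass_bins(conf, n_bins):
    """Split sample indices into n_bins groups of near-equal size, ordered by confidence."""
    order = np.argsort(conf)
    return np.array_split(order, n_bins)

def bin_summary(conf, safe, n_bins):
    """Per-bin (mean confidence, empirical safety rate, sample count).

    Assumes n_bins does not exceed the number of samples, so no bin is empty.
    """
    conf = np.asarray(conf, dtype=float)
    safe = np.asarray(safe, dtype=float)
    rows = []
    for idx in equal_mass_bins(conf, n_bins):
        rows.append((conf[idx].mean(), safe[idx].mean(), idx.size))
    return rows
```

Compared with fixed-width bins, equal-mass bins avoid nearly empty bins when confidences cluster near 0 or 1, which keeps the per-bin safety rates from resting on a handful of samples.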