Conformal Decision Theory: Safe Autonomous Decisions from Imperfect Predictions

Jordan Lekeufack; Anastasios N. Angelopoulos; Andrea Bajcsy; Michael I. Jordan; Jitendra Malik

TL;DR

Conformal Decision Theory (CDT) directly calibrates autonomous decisions to achieve low long-run risk under imperfect predictions, without constructing prediction sets. By maintaining a conformal control variable $\lambda_t$ that trades conservatism for performance and updating it online, CDT guarantees finite-time, distribution-free risk bounds in adversarial settings. The paper proves key risk-bounding results for a conformal controller and demonstrates effectiveness across robot navigation, manufacturing, and stock trading, highlighting improved efficiency while preserving safety relative to baseline methods. This framework broadens conformal prediction from uncertainty quantification to decision-level guarantees, with potential impact in control, reinforcement learning, and logistics where decision quality matters most.

Abstract

We introduce Conformal Decision Theory, a framework for producing safe autonomous decisions despite imperfect machine learning predictions. Examples of such decisions are ubiquitous, from robot planning algorithms that rely on pedestrian predictions, to calibrating autonomous manufacturing to exhibit high throughput and low error, to the choice of trusting a nominal policy versus switching to a safe backup policy at run-time. The decisions produced by our algorithms are safe in the sense that they come with provable statistical guarantees of having low risk without any assumptions on the world model whatsoever; the observations need not be I.I.D. and can even be adversarial. The theory extends results from conformal prediction to calibrate decisions directly, without requiring the construction of prediction sets. Experiments demonstrate the utility of our approach in robot motion planning around humans, automated stock trading, and robot manufacturing.

Paper Structure (9 sections, 3 theorems, 20 equations, 5 figures, 1 table)

Introduction
Related Work
Conformal Decision Theory
Theory & Conformal Controller Algorithm
Experiments
Robot Navigation in Stanford Drone Dataset
Manufacturing Assembly Line Robot
Stock Trading Agent
Discussion & Conclusion

Key Result

Theorem 1

Consider the following update rule for $\lambda_{1:T}$: $\lambda_{t+1} = \lambda_t + \eta(\varepsilon - \ell_t)$, where $\eta > 0$ and $\ell_t := \mathcal{L}(D_t^{\lambda_t}(x_t), y_t)$. If $\lambda_1 \geq \lambda^{\rm safe} - \eta$ and $\mathcal{D}_{1:T}$ satisfies Definition 1 (Eventually Safe) for a given $K \geq 1$ and $\varepsilon^{\rm safe} \leq \varepsilon$, then for any realization of the data, the empirical risk is bounded: $\hat{R}_t \leq \varepsilon + \frac{\lambda_1 - \lambda^{\rm safe} + K\eta}{\eta t}$ for all $t \in [K, \ldots, T]$.
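The controller in Theorem 1 can be illustrated with a small simulation. The sketch below is not the authors' implementation. It assumes the update takes the additive form $\lambda_{t+1} = \lambda_t + \eta(\varepsilon - \ell_t)$ and uses a hypothetical toy environment in which a larger $\lambda$ means a more aggressive decision: the loss is 1 with probability $\min(1, \max(0, \lambda))$ and is always 0 when $\lambda \leq 0$, so $\lambda^{\rm safe} = 0$, $\varepsilon^{\rm safe} = 0$, and $K = 1$. The names `run_conformal_controller` and `make_toy_loss` are made up for this example. The bound it compares against, $\varepsilon + (\lambda_1 - \lambda^{\rm safe} + K\eta)/(\eta T)$, is what telescoping this recursion gives under those assumptions.

```python
import random


def run_conformal_controller(loss_fn, lam1, eta, eps, T):
    """Run the assumed update lam_{t+1} = lam_t + eta * (eps - loss_t).

    loss_fn(t, lam) is a hypothetical callback that acts with the current
    lam and returns the observed loss in [0, 1].
    """
    lam = lam1
    lams, losses = [], []
    for t in range(1, T + 1):
        loss = loss_fn(t, lam)          # loss observed after acting with lam_t
        lams.append(lam)
        losses.append(loss)
        lam = lam + eta * (eps - loss)  # loosen if loss < eps, tighten otherwise
    return lams, losses


def make_toy_loss(seed=0):
    """Toy environment (an assumption): failure probability grows with lam."""
    rng = random.Random(seed)

    def loss_fn(t, lam):
        p_fail = min(1.0, max(0.0, lam))  # never fails when lam <= 0
        return 1.0 if rng.random() < p_fail else 0.0

    return loss_fn


LAM1, ETA, EPS, T = 0.5, 0.05, 0.1, 2000
LAM_SAFE, K = 0.0, 1  # properties of the toy environment above

lams, losses = run_conformal_controller(make_toy_loss(), LAM1, ETA, EPS, T)
empirical_risk = sum(losses) / T
bound = EPS + (LAM1 - LAM_SAFE + K * ETA) / (ETA * T)

print(f"empirical risk = {empirical_risk:.4f}, bound = {bound:.4f}")
```

In this toy setting $\lambda_t$ can never drop below $\lambda^{\rm safe} - \eta$, because the loss is 0 whenever $\lambda_t \leq 0$. That is why the check holds for any seed and not just on average.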

Figures (5)

Figure 1: Robot planner using a conformal controller on the Stanford Drone Dataset (Robicquet et al., 2016). The future trajectories of humans are predicted online by a machine learning algorithm (not visualized). The robot planner finds an optimal spline through the scene and is penalized for being close to humans. This penalty is proportional to a conformal control variable, $\lambda_t$, which is adjusted online by the conformal controller so the average distance from a human is no less than two meters. The orange, red, and blue curves are the robot trajectory with different planners: the conformal controller, an aggressive planner with $\lambda=0$ (i.e., no reward for avoiding humans), and a conservative planner with a large negative value of $\lambda$ (i.e., a large reward for avoiding humans). The darkness of the lines indicates the passage of time. Illustrative pedestrian trajectories are plotted as arrows; only the yellow pedestrians affect the spline planner. Details are in the section "Robot Navigation in Stanford Drone Dataset"; videos are at https://conformal-decision.github.io.
Figure 2: Stanford Drone Dataset: Qualitative Results. Visualization of interaction over time (left to right). (Top) With our conformal controller (CC), the robot always makes progress towards its goal while remaining safe, even when blocked by crowds of people. (Bottom) The ACI baseline calibrates the prediction sets. As soon as a mis-prediction happens, ACI expands the prediction sets to obtain coverage, but this frequently blocks the robot from moving anywhere (see $t=10 s$), even though the mis-predictions occurred for a pedestrian who was far away and not interfering with the robot's plan.
Figure 3: Stanford Drone Dataset. (Top) Trajectories of $\lambda_t$ (calibrated by CC) and $\alpha_t$ (used by ACI to calibrate sets). When $\alpha_t \leq 0$, ACI returns an infinite set and the robot stops. (Bottom) Distance to the nearest human over time. $\lambda_t$ is large when the robot is close to a human, while $\alpha_t$ is unrelated to that distance. The $\lambda_t$ trajectory is shorter because the CC robot reaches the goal faster.
Figure 4: Manufacturing Assembly Line Robot: Quantitative Results. (Left) Illustrative example: the robot must adjust its speed so that it grasps the most items while minimizing grasp failures. (Right) Empirical risk, $\hat{R}_T$, and average utility (i.e., successful grasps), $\hat{V}_T$, over 1000 runs. Our method is denoted by (CC). The dashed red line is the target risk $\varepsilon=0.05$.
Figure 5: Stock Trading: Quantitative Results. All results are over a 5-year period. The yearly loss threshold is $\varepsilon=25\%$. (Left) Despite a poor prediction model of the return (negative correlation), the CC keeps the loss bounded at the user's threshold (bottom, dashed red line overlaps with the orange CC line) but does not achieve the highest return. (Right) With a strong prediction model of the return (positive correlation), the CC achieves high yearly returns (second only to Greedy) while respecting the loss threshold (which Greedy violates).

Theorems & Definitions (7)

Definition 1: Eventually Safe
Theorem 1: Conformal Controller
Proof of Theorem 1
Lemma 1.1
proof
Remark 1
Corollary 2
