Learning Robot Safety from Sparse Human Feedback using Conformal Prediction

Aaron O. Feldman; Joseph A. Vincent; Maximilian Adang; Jun En Low; Mac Schwager

TL;DR

The paper tackles robot safety in settings where formal constraints are hard to specify by learning from sparse human labels. It introduces a conformal-prediction framework around a nearest-neighbor classifier to define a Suspected Unsafe Sublevel (SUS) region $C(\epsilon)$ that contains at least a fraction $1-\epsilon$ of future unsafe states, without withholding data for calibration. The authors develop closed-form calibration, extend it to the unsafe-safe case that uses both unsafe and safe samples, and connect the framework to a practical warning system and a backup safety policy that avoids entering $C(\epsilon)$. They demonstrate it on quadcopter MPC and visuomotor policies in both simulation and hardware experiments. The approach yields guaranteed miss rates, is interpretable (the SUS region is geometric), and integrates with representation learning to handle high-dimensional observations, enabling safer policy execution with minimal data and effort.

Abstract

Ensuring robot safety can be challenging; user-defined constraints can miss edge cases, policies can become unsafe even when trained from safe data, and safety can be subjective. Thus, we learn about robot safety by showing policy trajectories to a human who flags unsafe behavior. From this binary feedback, we use the statistical method of conformal prediction to identify a region of states, potentially in learned latent space, guaranteed to contain a user-specified fraction of future policy errors. Our method is sample-efficient, as it builds on nearest neighbor classification and avoids withholding data as is common with conformal prediction. By alerting if the robot reaches the suspected unsafe region, we obtain a warning system that mimics the human's safety preferences with guaranteed miss rate. From video labeling, our system can detect when a quadcopter visuomotor policy will fail to steer through a designated gate. We present an approach for policy improvement by avoiding the suspected unsafe region. With it we improve a model predictive controller's safety, as shown in experimental testing with 30 quadcopter flights across 6 navigation tasks. Code and videos are provided.

Paper Structure (30 sections, 8 theorems, 66 equations, 16 figures)

Introduction
Literature Review
Learning Robot Safety
Learning Constraints from Expert Demonstrations
Learning Safety from Binary Feedback
Learning Safety from Human Interaction
Conformal Prediction in Robotics
Conformal Prediction for Anomaly Detection
Conformal Prediction for Collision Avoidance
Overview of Conformal Prediction
Problem Setting and Approach Overview
Nearest Neighbor Conformal Prediction
Main Results
Unsafe-Safe Nearest Neighbor Extension
Probabilistic Interpretation
...and 15 more sections

Key Result

Theorem 5.1

Let $D = \{x_1, ..., x_N\}$ be states drawn IID from any, possibly unknown, distribution $F$ and suppose we are given a miscoverage rate $\epsilon$. Let $\alpha_1, ..., \alpha_N$ be the intra-data nearest neighbor values and let $k = k(\epsilon) \leq N$ (Eq. \ref{eq: k_epsilon}). Using $r = \alpha_{(k)}$ to form $C(\epsilon)$ satisfies, for new states $x_{N+1} \sim F$, the coverage guarantee $\Pr[x_{N+1} \in C(\epsilon)] \geq 1 - \epsilon$.
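
The calibration step in Theorem 5.1 can be illustrated with a minimal sketch. It rests on several assumptions not confirmed by the text above: the nearest neighbor value $\alpha_i$ is taken to be the Euclidean distance from $x_i$ to its closest other point in $D$, the region $C(\epsilon)$ is taken to be the set of states within distance $r$ of some point in $D$, and the index $k$ is passed in directly because the paper's formula for $k(\epsilon)$ is not reproduced here. The function names are hypothetical.

```python
import numpy as np

def intra_data_nn_values(D):
    """Distance from each point in D to its nearest *other* point in D.

    Assumption: Euclidean distance; the paper may use other metrics.
    """
    diffs = D[:, None, :] - D[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude each point's distance to itself
    return dist.min(axis=1)

def calibrate_radius(D, k):
    """Return r = alpha_(k), the k-th smallest intra-data nearest neighbor value.

    k is supplied by the caller; the paper's k(epsilon) is not reproduced here.
    """
    n = len(D)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= N")
    alphas = np.sort(intra_data_nn_values(D))
    return float(alphas[k - 1])

def in_suspected_unsafe_region(x, D, r):
    """Assumed membership test: x is within distance r of some point in D."""
    return bool(np.linalg.norm(D - x, axis=1).min() <= r)

# Toy data: four unsafe states on a line, with an illustrative k = 3.
D = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [7.0, 0.0]])
r = calibrate_radius(D, k=3)
print(r, in_suspected_unsafe_region(np.array([5.0, 0.0]), D, r))
```

Because all $N$ labeled states are used both to define the region and to pick $r$, no data is held out, which is the sample-efficiency point made in the abstract.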

Figures (16)

Figure 1: Overview of our approach to learning robot safety from sparse human feedback. Given a robot policy, we repeatedly demonstrate it to a human, possibly in simulation, and have them terminate any trajectories which they deem unsafe. Using these binary labels, we apply conformal prediction to calibrate a nearest neighbor classifier and determine a suspected unsafe region $C(\epsilon)$ containing at least $1-\epsilon$ of states that would be deemed unsafe by the human. Using $C(\epsilon)$, we can improve the original policy's safety via an auxiliary warning system or a backup safety controller. The graphics show experiment results wherein we use human feedback to (i) develop a warning system for a visuomotor quadcopter policy and (ii) increase the safety of a quadcopter model predictive controller.
Figure 2: Visualizing the geometry of the unsafe-only (Eq. \ref{eq: score func}) and unsafe-safe (Eq. \ref{eq: asym_score_func}) conformal covering set $C(\epsilon)$. We sample $N = 30$ unsafe points (shown in red) and $M = 100$ safe points (shown in blue) and request miscoverage of $\epsilon = 0.1$. In the top subfigure, we plot $C(\epsilon)$ for the unsafe-only case using Euclidean distance. In the bottom subfigure, we use the difference of squared Euclidean distance for the unsafe-safe case (Eq. \ref{eq: two_sample_special}), resulting in a union of polyhedra (see subsection \ref{subsec: two_sample}).
Figure 3: Empirical verification of the theoretical coverage guarantees. For both the unsafe-only and unsafe-safe cases, we construct $C(\epsilon)$ $1000$ times with fresh data (using the same setup as Figure 2) and evaluate coverage using $1000$ new unsafe test points. We plot a histogram of the coverage over these repetitions and plot the average coverage as a green dashed line. The average coverage lies between the theoretical lower bound (Theorem 5.1) and upper bound (Theorem 5.3 for the unsafe-only case and Theorem 5.4 for the unsafe-safe case), shown respectively as red and blue dashed lines.
Figure 4: Visualizing the $p$-value associated with $C(\epsilon)$. Using the same setup as in Figure 2, we vary $\epsilon$ to visualize the $p$-value level sets for the unsafe-only (top) and unsafe-safe (bottom) cases. While both the unsafe-only and unsafe-safe cases provide valid coverage of at least $1-\epsilon$, the $p$-value visualizations show qualitatively that the unsafe-safe approach better distinguishes the unsafe and safe samples.
Figure 5: Visualization of our warning system augmenting a quadcopter MPC policy that sometimes collides with unknown obstacles. We specify $\epsilon = 0.2$ and use $N = 25$ errors for unsafe-safe nearest neighbor conformal prediction. The top subfigure shows the original trajectories, which collide (marked in red) $52\%$ of the time. The middle subfigure shows a 3D visualization of the SUS region $C(\epsilon)$ which actually exists in the 9D state space. The bottom subfigure shows the warning system tested for $50$ new trajectories. States triggering an alert are marked with an orange 'x.' Terminating upon alert drops the error rate to $4\%$.
...and 11 more figures
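
The warning system in Figure 5 and the unsafe-safe score in Figure 2 can be sketched together. This is a hedged illustration, not the authors' implementation: the score is assumed to be the squared Euclidean distance to the nearest unsafe sample minus the squared Euclidean distance to the nearest safe sample, the alert is assumed to fire when the score is at or below a threshold, and the threshold is passed in directly because its calibration is not reproduced here. The function names and toy data are hypothetical.

```python
import numpy as np

def unsafe_safe_score(x, unsafe, safe):
    """Assumed score: nearest squared distance to unsafe minus nearest to safe.

    Lower values mean x looks more like the unsafe samples.
    """
    d_unsafe = np.sum((unsafe - x) ** 2, axis=1).min()
    d_safe = np.sum((safe - x) ** 2, axis=1).min()
    return float(d_unsafe - d_safe)

def first_alert_index(trajectory, unsafe, safe, threshold):
    """Index of the first state whose score is <= threshold, else None.

    The threshold stands in for the calibrated value; it is an input here.
    """
    for t, x in enumerate(trajectory):
        if unsafe_safe_score(x, unsafe, safe) <= threshold:
            return t
    return None

# Toy example: one unsafe sample at the origin, one safe sample at (4, 0).
unsafe = np.array([[0.0, 0.0]])
safe = np.array([[4.0, 0.0]])

# A trajectory drifting from the safe sample toward the unsafe one.
approaching = np.array([[4.0, 0.0], [3.0, 0.0], [2.0, 0.0], [1.0, 0.0]])
# A trajectory moving away from the unsafe sample.
receding = np.array([[4.0, 0.0], [5.0, 0.0]])

print(first_alert_index(approaching, unsafe, safe, threshold=0.0))
print(first_alert_index(receding, unsafe, safe, threshold=0.0))
```

Under this assumed score, fixing one unsafe sample and one safe sample makes the score linear in $x$, so each sublevel set is a half-space. Taking the minimum over unsafe samples and the maximum over safe samples then gives a union of intersections of half-spaces, which is consistent with the "union of polyhedra" described in the Figure 2 caption.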

Theorems & Definitions (18)

Theorem 5.1: Closed-Form Conformal Prediction
Theorem 5.2: Geometric Coverage Set
Theorem 5.3: Overcoverage Bound: Symmetric Case
Theorem 5.4: Overcoverage Bound: Asymmetric Case
Definition 5.1: Conformal $p$-Value
Theorem 6.1: Warning Miss Rate
proof
Corollary 6.1.1
proof
Lemma 1: Conformal Guarantee with Ties
...and 8 more
