Risk-Calibrated Human-Robot Interaction via Set-Valued Intent Prediction

Justin Lidard; Hang Pham; Ariel Bachman; Bryan Boateng; Anirudha Majumdar

TL;DR

RCIP addresses risk-aware human-robot interaction under latent, multi-modal intents by coupling set-valued intent prediction with statistical risk calibration. It frames action selection as a sequence-level multi-hypothesis testing problem, providing finite-sample guarantees on cumulative loss while enabling flexible autonomy through tunable parameters $(\lambda,\theta)$ and risk budgets $(\alpha_1,\dots,\alpha_K)$. The approach supports both task-specific and zero-shot intent predictors and extends conformal prediction to sequences with a calibration stage for multi-risk objectives. Empirical results across four domains (simulation and hardware) show that RCIP preserves high task success while substantially reducing human input compared to baselines, demonstrating practical certifiable autonomy in diverse interactive settings.

Abstract

Tasks where robots must anticipate human intent, such as navigating around a cluttered home or sorting everyday items, are challenging because they exhibit a wide range of valid actions that lead to similar outcomes. Moreover, zero-shot cooperation between human-robot partners is an especially challenging problem because it requires the robot to infer and adapt on the fly to a latent human intent, which could vary significantly from human to human. Recently, deep learned motion prediction models have shown promising results in predicting human intent but are prone to being confidently incorrect. In this work, we present Risk-Calibrated Interactive Planning (RCIP), which is a framework for measuring and calibrating risk associated with uncertain action selection in human-robot cooperation, with the fundamental idea that the robot should ask for human clarification when the risk associated with the uncertainty in the human's intent cannot be controlled. RCIP builds on the theory of set-valued risk calibration to provide a finite-sample statistical guarantee on the cumulative loss incurred by the robot while minimizing the cost of human clarification in complex multi-step settings. Our main insight is to frame the risk control problem as a sequence-level multi-hypothesis testing problem, allowing efficient calibration using a low-dimensional parameter that controls a pre-trained risk-aware policy. Experiments across a variety of simulated and real-world environments demonstrate RCIP's ability to predict and adapt to a diverse set of dynamic human intents.
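The ask-for-clarification idea can be illustrated with a minimal sketch. It assumes the intent predictor outputs a probability per candidate intent and that the prediction set keeps every intent whose probability is at least a threshold `lam`; the function names, the thresholding rule, and the intent labels are illustrative assumptions, not the paper's implementation.

```python
from typing import Dict, List, Tuple


def prediction_set(intent_probs: Dict[str, float], lam: float) -> List[str]:
    """Keep every intent whose predicted probability is at least lam (assumed rule)."""
    return sorted(i for i, p in intent_probs.items() if p >= lam)


def act_or_ask(intent_probs: Dict[str, float], lam: float) -> Tuple[str, List[str]]:
    """Act autonomously only when the prediction set is a singleton; otherwise ask."""
    pset = prediction_set(intent_probs, lam)
    if len(pset) == 1:
        return "act", pset
    # An empty or multi-element set both count as "not a singleton".
    return "ask", pset


# Hypothetical predictor output for three candidate intents.
probs = {"hallway_1": 0.7, "hallway_2": 0.2, "hallway_3": 0.1}
confident = act_or_ask(probs, lam=0.5)
ambiguous = act_or_ask(probs, lam=0.15)
```

With the higher threshold only one intent survives and the robot acts; with the lower threshold two intents remain and the robot asks for help. Choosing the threshold so that the resulting risk is provably bounded is the job of the calibration stage.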

Paper Structure (26 sections, 5 theorems, 25 equations, 14 figures, 8 tables)

Introduction
Related Work
Contingency Planning and Privileged Learning
Human Intent Prediction
Conformal Prediction and Empirical Risk Control
Problem Formulation
Dynamic Programming with Intent Uncertainty
Risk-Calibrated Interactive Planning
Goal: Certifiable Autonomy
Approach
Background: Statistical Risk Calibration
Single-Step, Single-Risk Control
Single-Step, Multi-Risk Control
Multi-Step, Single-Risk Control
Multi-Step, Multi-Risk Control
...and 11 more sections

Key Result

Proposition 1

Consider a single-step setting ($T=1$) where we use risk calibration parameters $(\lambda, \theta) \in \hat{\Phi}$ to generate predicted action sets and seek help whenever the prediction set is not a singleton (cf. the RCIP overview). If the FWER-controlling parameter set $\hat{\Phi}$ is non-em

Figures (14)

Figure 2: RCIP formulates interactive planning as a multi-hypothesis risk control problem. Using a small set of calibration scenarios, RCIP computes step-wise prediction losses to form an aggregate empirical risk estimate. Given a risk limit, for each pair $(\lambda, \theta)$ of prediction thresholds and tunable model parameters, RCIP tests the hypothesis that the test set risk is above the limit. Thus, for every hypothesis that is rejected, the test set risk satisfies the limit (with high probability).
Figure 3: Multi-step RCIP is applied in Hallway Navigation. The robot car (blue) and human car (red) are tasked with navigating to their respective goal states (large blue and red rectangles). The human car is constrained via its intent to pass through one of the five hallways (highlighted in red). The blue car does not observe the human's intent during evaluation.
Figure 4: (Left, Center) Multi-step RCIP is applied in Social Navigation. The human's trajectory is shown in pink, and the robot's trajectory is shown in blue. The human's possible goal objects are shown in orange. (Right) Single-step RCIP is applied in Bimanual Sorting. KnowNo, which generates plans in open-ended language, may generate a plan that is technically correct, but ambiguous to execute for a language-conditioned policy (both the blue and white bin have a pot). RCIP instead guarantees that the human's intent is satisfied via constraint satisfaction with the intent-conditioned planner.
Figure 5: Baseline comparison for RCIP versus other set-valued predictors for all tasks. RCIP consistently requires less help to achieve a specified plan success rate than other baseline methods. RCIP provides a framework for tuning model parameters to achieve risk control, versus other methods that assume that model parameters are held fixed: KnowNo (Ren et al., 2023), Simple Set, Entropy Set, and No Help.
Figure 6: Ablation study on the effect of action miscoverage and help rate risk limits versus FWER-controlling parameter set size for RCIP on Hallway Navigation using $\alpha_\text{cov} \in [0, 0.45]$ and $\alpha_\text{help} \in [0, 1]$. The color denotes the size of the set of FWER-controlling parameters $\hat{\Phi}$, with empty (infeasible) sets taking a size of zero.
...and 9 more figures
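The hypothesis-testing procedure described in the Figure 2 caption can be sketched as follows. This is a hedged illustration, not the paper's exact procedure: it assumes each calibration scenario yields one loss in $[0, 1]$ per risk (already aggregated over time steps), uses a Hoeffding p-value, and controls the family-wise error rate with a Bonferroni correction. The function names, the confidence level `delta`, and the toy losses are all assumptions.

```python
import math
from typing import Dict, Hashable, List, Sequence


def hoeffding_p_value(emp_risk: float, alpha: float, n: int) -> float:
    """P-value for the null 'true risk > alpha', valid for losses bounded in [0, 1]."""
    gap = max(alpha - emp_risk, 0.0)
    return math.exp(-2.0 * n * gap * gap)


def calibrate(
    losses: Dict[Hashable, Sequence[Sequence[float]]],
    alphas: Sequence[float],
    delta: float,
) -> List[Hashable]:
    """Return the parameters whose null hypothesis is rejected for every risk.

    losses[phi][k] holds the per-scenario losses of risk k under parameter phi.
    """
    m = len(losses)  # number of hypotheses tested (size of the parameter grid)
    valid = []
    for phi, per_risk in losses.items():
        # The null "some risk exceeds its limit" gets the largest per-risk p-value.
        p = max(
            hoeffding_p_value(sum(l) / len(l), a, len(l))
            for l, a in zip(per_risk, alphas)
        )
        if p <= delta / m:  # Bonferroni correction for FWER control
            valid.append(phi)
    return valid


# Toy calibration data: keys are hypothetical (lambda, theta) pairs,
# values are [miscoverage losses, help-rate losses] over 200 scenarios.
toy_losses = {
    (0.5, 1.0): [[0.0] * 200, [1.0] * 40 + [0.0] * 160],
    (0.9, 1.0): [[1.0] * 100 + [0.0] * 100, [0.0] * 200],
}
phi_hat = calibrate(toy_losses, alphas=[0.3, 0.6], delta=0.05)
```

In this toy run only the first pair is certified, because the second has an empirical miscoverage of 0.5, above its limit of 0.3. An empty result corresponds to the infeasible case that Figure 6 shows as a set of size zero.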

Theorems & Definitions (11)

Remark
Remark
Proposition 1
proof
Proposition 2
proof
Proposition 3
proof
Proposition 4
proof
...and 1 more
