Should Collaborative Robots be Transparent?

Shahabedin Sagheb, Soham Gandhi, Dylan P. Losey

TL;DR

This work investigates whether transparent robot behavior is always optimal in collaborative tasks where humans and robots share the same objective but the human is uncertain about the robot's type. It formalizes the interaction as a two-player stochastic Bayesian game with an augmented state $(s,b)$ and solves for optimal robot policies using a Harsanyi-Bellman Ad Hoc Coordination framework, yielding conditions under which opacity can be advantageous. The study defines Fully Opaque and Rationally Opaque policies and proves through a 1-DoF example that opacity can be optimal, especially in short-horizon tasks or when human learning is slow. Simulations and two user studies (online autonomous driving and in-person block stacking) show higher rewards with opaque partners and no clear negative effect on perceived collaboration. These results imply that withholding information can improve performance in brief interactions, offering practical guidance for designing shared autonomy and human-robot teams across manufacturing, autonomous driving, and assistive robotics. The findings highlight a trade-off between task efficiency and communicative transparency, and suggest a crossover point beyond which transparency becomes beneficial as interactions lengthen or human learning accelerates.

Abstract

We often assume that robots which collaborate with humans should behave in ways that are transparent (e.g., legible, explainable). These transparent robots intentionally choose actions that convey their internal state to nearby humans: for instance, a transparent robot might exaggerate its trajectory to indicate its goal. But while transparent behavior seems beneficial for human-robot interaction, is it actually optimal? In this paper we consider collaborative settings where the human and robot have the same objective, and the human is uncertain about the robot's type (i.e., the robot's internal state). We extend a recursive combination of Bayesian Nash equilibrium and the Bellman equation to solve for optimal robot policies. Interestingly, we discover that it is not always optimal for collaborative robots to be transparent; instead, human and robot teams can sometimes achieve higher rewards when the robot is opaque. In contrast to transparent robots, opaque robots select actions that withhold information from the human. Our analysis suggests that opaque behavior becomes optimal when either (a) human-robot interactions have a short time horizon or (b) users are slow to learn from the robot's actions. We extend this theoretical analysis to user studies across 43 total participants in both online and in-person settings. We find that -- during short interactions -- users reach higher rewards when working with opaque partners, and subjectively rate opaque robots as about equal to transparent robots. See videos of our experiments here: https://youtu.be/u8q1Z7WHUuI

Paper Structure (17 sections, 6 equations, 8 figures)

Figures (8)

  • Figure 1: Collaborative block-stacking task where the human is uncertain about the robot's internal state $\theta$. Transparent robot actions help the human learn $\theta$ and decide what blocks to add to the tower. However, we find that the costs of this transparent behavior may outweigh its benefits
  • Figure 2: Example of an optimal, fully opaque robot. The system starts at position $s = 0.6$ and prior $b^0 = 0.2$. The robot has two types $\theta$: confused and capable. The confused robot can only move towards the left block. (a) Optimal human and robot solve this stochastic Bayesian game. (b) Optimal robot is paired with a human that takes random actions. Regardless of the human's actions, both capable and confused robots always move towards the left block. Hence, the robot is fully opaque, and the human cannot infer $\theta$ from the optimal robot's actions
  • Figure 3: Simulation results from our $1$-DoF Environment. (a) The human and robot collaborated to reach a goal; the confused robot could only go left while the capable robot could help reach right or left. For each plot we sampled all start states and priors and then calculated the percentage of those augmented states which were opaque; e.g., $50\%$ opaque means that for half of the initial augmented states it was optimal for the robot to withhold its type $\theta$ from the human. (b) We varied the human's learning rate and the total number of timesteps in each interaction. A higher learning rate indicated that the human uncovered $\theta$ more quickly when the actions for each robot type diverged. (c) We also tested a human that used Bayesian inference to update their belief and two bounded memory humans (with learning rates of $0.3$ and $0.7$) that forgot what they had learned after each timestep
  • Figure 4: Simulation results from our robot arm environment. (a) Humans shared control with a robot arm to reach for goals on the table; the capable robot could go towards any goal while the confused robot could only move down and left. The format of our results follows Figure 3. (b) The number of opaque states decreases as the interaction time increases. (c) The number of opaque states also decreases as the human learns more quickly. Note that the Bayesian human is an ideal user that can infer the robot's type from a single timestep; i.e., this human model learns $\theta$ as efficiently as possible. When compared to this ideal human, it is more likely for opaque behavior to be optimal when the robot is collaborating with a forgetful user that follows the bounded memory model. Overall, our results show that opaque behavior is more likely to be optimal during short interactions with suboptimal humans
  • Figure 5: Task results from our online user study. Participants collaborated with a virtual agent to drive a car in Passing, Turning, and Parking environments. Error bars show standard error and an $*$ denotes statistical significance ($p<.05$)
  • ...and 3 more figures
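
The captions for Figures 3 and 4 describe three simulated human models: a human with a learning rate, a Bayesian human, and a bounded memory human. The Python sketch below is one possible reading of those descriptions, not the paper's implementation. It assumes the belief is a single number (the probability that the robot is capable), that each robot type acts deterministically, and that the human moves their belief toward the revealed type by a fraction `learning_rate` whenever the two types would act differently. The names `learning_rate_update` and `simulate_belief` are hypothetical.

```python
def learning_rate_update(belief, action, capable_action, confused_action, learning_rate):
    """Return the human's updated belief that the robot is capable.

    Assumption (not from the paper): when both robot types would take the
    same action, the action carries no information and the belief is unchanged.
    """
    if capable_action == confused_action:
        return belief  # opaque step: nothing to learn
    # The action reveals the type; move toward certainty at the learning rate.
    target = 1.0 if action == capable_action else 0.0
    return belief + learning_rate * (target - belief)


def simulate_belief(prior, steps, learning_rate, bounded_memory=False):
    """Track the belief over a sequence of (action, capable_action, confused_action).

    learning_rate = 1.0 mimics the ideal Bayesian human, who identifies the
    type after a single revealing timestep. With bounded_memory=True the human
    forgets what they learned and restarts from the prior at every timestep.
    """
    belief = prior
    history = []
    for action, capable_action, confused_action in steps:
        start = prior if bounded_memory else belief
        belief = learning_rate_update(
            start, action, capable_action, confused_action, learning_rate
        )
        history.append(belief)
    return history


# Hypothetical interaction: the first step is opaque, the next two reveal a capable robot.
steps = [
    ("left", "left", "left"),
    ("right", "right", "left"),
    ("right", "right", "left"),
]
slow_human = simulate_belief(0.2, steps, learning_rate=0.5)
forgetful_human = simulate_belief(0.2, steps, learning_rate=0.5, bounded_memory=True)
ideal_human = simulate_belief(0.2, steps, learning_rate=1.0)
```

Under these assumptions a fully opaque robot, whose types always take the same action, leaves the human's belief at the prior for the whole interaction. This matches the caption of Figure 2, where the human cannot infer $\theta$ from the optimal robot's actions.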