SARI: Shared Autonomy across Repeated Interaction

Ananth Jonnavittula, Shaunak A. Mehta, Dylan P. Losey

TL;DR

This paper introduces SARI, an algorithm that recognizes the human’s task, replicates similar demonstrations, and returns control when unsure. The results indicate that learning shared autonomy across repeated interactions matches existing approaches for known tasks and outperforms baselines on new tasks.

Abstract

Assistive robot arms try to help their users perform everyday tasks. One way robots can provide this assistance is shared autonomy. Within shared autonomy, both the human and robot maintain control over the robot's motion: as the robot becomes confident it understands what the human wants, it intervenes to automate the task. But how does the robot know these tasks in the first place? State-of-the-art approaches to shared autonomy often rely on prior knowledge. For instance, the robot may need to know the human's potential goals beforehand. During long-term interaction these methods will inevitably break down -- sooner or later the human will attempt to perform a task that the robot does not expect. Accordingly, in this paper we formulate an alternate approach to shared autonomy that learns assistance from scratch. Our insight is that operators repeat important tasks on a daily basis (e.g., opening the fridge, making coffee). Instead of relying on prior knowledge, we therefore take advantage of these repeated interactions to learn assistive policies. We introduce SARI, an algorithm that recognizes the human's task, replicates similar demonstrations, and returns control when unsure. We then combine learning with control to demonstrate that the error of our approach is uniformly ultimately bounded. We perform simulations to support this error bound, compare our approach to imitation learning baselines, and explore its capacity to assist for an increasing number of tasks. Finally, we conduct three user studies with industry-standard methods and shared autonomy baselines, including a pilot test with a disabled user. Our results indicate that learning shared autonomy across repeated interactions matches existing approaches for known tasks and outperforms baselines on new tasks. See videos of our user studies here: https://youtu.be/3vE4omSvLvc

Paper Structure (23 sections, 38 equations, 20 figures)

Figures (20)

  • Figure 1: User teleoperating an assistive robot arm to open their fridge door. The robot does not have any prior knowledge about this task; however, the human and robot have completed similar tasks many times before. Instead of making the human guide the robot through every step of this task, we hypothesize that robot arms can learn to assist humans and share autonomy by exploiting the repeated nature of everyday tasks.
  • Figure 2: We separate prior work on shared autonomy for assistive robot arms into two groups. (Left) Some methods are given a discrete set of possible goals, and infer the human's goal from these discrete options. (Right) Other methods learn to map the human's joystick inputs to constrained, task-relevant motions. Although both shared autonomy algorithms help this human reach for the cups, neither can assist the human for a new, unexpected task (like opening the fridge).
  • Figure 3: Outline of SARI, our proposed algorithmic framework for learning to share autonomy across repeated interaction. (Left) The robot embeds the human's behavior $\tau^i$ during the current interaction to a distribution over latent tasks $z$. (Middle) The robot then chooses assistive actions $a_{\mathcal{R}}$ conditioned on its state $s$ and latent task $z$. The assistive policy $\pi_{\mathcal{R}}$ is trained to match the user's behavior from previous interactions. (Right) To decide whether or not to trust this assistive action, the robot turns to a discriminator $\mathcal{C}$. The discriminator assesses whether the current interaction $\tau^i$ is similar to any previously seen interaction: if so, the robot increases autonomy. In this example the robot remembers how the human has opened the fridge in the past, and assists for that task. But when the human does something new (reaching for the cup) the robot realizes that it does not know how to help, and arbitrates control back to the human.
  • Figure 4: Error bounds for the $1$-DoF system as a function of human noise. All values are in meters. Plots generated using Equation (\ref{eq:T6}) and Equation (\ref{eq:T7}) with $\beta_{max} = 1$. (Left) For a fixed $\sigma_{\mathcal{H}}=1$ we increase $\sigma_{\mathcal{D}}$. This captures a human who provided increasingly noisy inputs during past interactions when they were reaching for the known goal $g$. (Right) For a fixed $\sigma_{\mathcal{D}}=1$ we increase $\sigma_{\mathcal{H}}$. This corresponds to a human who provides increasingly noisy inputs during the current interaction while reaching for the new goal $g^*$. We conclude that $\sigma_{\mathcal{D}}$ and $\sigma_{\mathcal{H}}$ have opposite effects on the theoretical error bound.
  • Figure 5: Error bound and experimental results for a $1$-DoF SARI system. All values are in meters. Here a simulated Gaussian human provided $250$ demonstrations reaching for their original goal $g$. These demonstrations were used to train the SARI algorithm; at test time the simulated human reached for a series of new goals $g^*$ with SARI assistance. For each $g^*$ we collected $10{,}000$ runs; the shaded region is the standard deviation across these runs. (Center) While reaching for the previous goal $g$ and new goal $g^*$ the human had noise $\sigma_{\mathcal{D}}=\sigma_{\mathcal{H}}=1$. For all choices of $g^*$ we have that $\mathbb{E}[\beta] < \beta_{max}$ in Equation (\ref{eq:T5}), and thus the theoretical bound is Equation (\ref{eq:T7}). (Right) We chose $\sigma_{\mathcal{D}} = \sigma_{\mathcal{H}}=0.1$ and obtained two different theoretical error bounds: when $g^*$ is close to $g$ then $\mathbb{E}[\beta] \geq \beta_{max}$ and Equation (\ref{eq:T6}) applies; but as $g^*$ gets farther from $g$ we have that $\mathbb{E}[\beta] < \beta_{max}$, and thus the bound is Equation (\ref{eq:T7}). The bound appears tight when $\mathbb{E}[\beta] < \beta_{max}$ and more conservative when $\mathbb{E}[\beta] \geq \beta_{max}$.
  • ...and 15 more figures
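
As a rough illustration of the arbitration step described in the Figure 3 caption, the sketch below blends the human's action with the robot's assistive action according to the discriminator's confidence. The linear blending rule, the function name `blend_actions`, the argument names, and the clipping of confidence to `beta_max` are assumptions made for illustration; the captions above do not state the exact arbitration law used by SARI.

```python
from typing import List, Sequence, Tuple


def blend_actions(
    a_human: Sequence[float],
    a_robot: Sequence[float],
    confidence: float,
    beta_max: float = 1.0,
) -> Tuple[List[float], float]:
    """Blend human and robot actions (hypothetical linear arbitration).

    confidence: discriminator output, assumed here to lie roughly in [0, 1],
                where higher means the current interaction resembles a
                previously seen one.
    beta_max:   assumed upper limit on robot autonomy.
    Returns the blended action and the blending weight beta that was used.
    """
    if len(a_human) != len(a_robot):
        raise ValueError("human and robot actions must have the same dimension")

    # Clip confidence so that beta stays within [0, beta_max].
    beta = min(max(confidence, 0.0), beta_max)

    # beta = 0 returns full control to the human; larger beta gives the
    # robot's assistive action more weight.
    blended = [(1.0 - beta) * h + beta * r for h, r in zip(a_human, a_robot)]
    return blended, beta


if __name__ == "__main__":
    # Unfamiliar interaction: the robot defers to the human.
    print(blend_actions([1.0, 0.0], [0.0, 1.0], confidence=0.0))
    # Familiar interaction: the robot takes on more of the motion.
    print(blend_actions([1.0, 0.0], [0.0, 1.0], confidence=0.25))
```

Under this assumed rule, a low discriminator confidence leaves the human in control, which mirrors the caption's description of the robot arbitrating control back when the human does something new.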