EVE: Enabling Anyone to Train Robots using Augmented Reality

Jun Wang, Chun-Cheng Chang, Jiafei Duan, Dieter Fox, Ranjay Krishna

TL;DR

EVE introduces an iOS augmented-reality app that democratizes robot data collection by allowing everyday users to train real robots without a physical robot. It combines three AR visualizations (AR-KT, Path History, Invisible Robot) with seven enhancements (hand-position projection, joint-constraints, realistic trajectories, replay, dynamic camera, reliable gripper control, and reselection of instantiation) to improve usability, verifiability, and data quality. In formative and evaluative studies, EVE outperformed AR2-D2 and matched kinesthetic teaching on several metrics, and data collected with EVE yielded higher real-world policy performance (30% vs 20% in a toggle-switch task after 30k iterations). The work demonstrates the potential of AR-based demonstration collection to empower personalized robotics in everyday settings and outlines concrete future directions, including bimanual control, mobile robots, SLAM-enabled multi-view collection, and hardware diversification.

Abstract

The increasing affordability of robot hardware is accelerating the integration of robots into everyday activities. However, training a robot to automate a task requires expensive trajectory data where a trained human annotator moves a physical robot to train it. Consequently, only those with access to robots produce demonstrations to train robots. In this work, we remove this restriction with EVE, an iOS app that enables everyday users to train robots using intuitive augmented reality visualizations, without needing a physical robot. With EVE, users can collect demonstrations by specifying waypoints with their hands, visually inspecting the environment for obstacles, modifying existing waypoints, and verifying collected trajectories. In a user study (N=14, D=30) consisting of three common tabletop tasks, EVE outperformed three state-of-the-art interfaces in success rate and was comparable to kinesthetic teaching (physically moving a physical robot) in completion time, usability, motion intent communication, enjoyment, and preference (mean of p=0.30). EVE allows users to train robots for personalized tasks, such as sorting desk supplies, organizing ingredients, or setting up board games. We conclude by enumerating limitations and design considerations for future AR-based demonstration collection systems for robotics.

Paper Structure (29 sections, 5 figures, 4 tables)

Figures (5)

  • Figure 1: EVE allows everyday users to collect data to train a real robot using intuitive augmented reality (AR) visualizations without needing a physical robot. Our application enables users to move the AR robot by setting waypoints with hand gestures, visually inspect the real-world environment to avoid obstacles, modify the collected trajectory, and verify the data collection by replaying the task with the AR robot. Videos of AR visualizations, features, and real-world deployment are available at https://junwang0510.github.io/EVE/.
  • Figure 2: The initial prototype of EVE includes three new AR visualizations. AR kinesthetic teaching allows the robot to track the user's hand movements in real time. Path history displays the trajectory along collected waypoints, enabling users to revert to previous waypoints. Invisible robot allows users to toggle off the robot's body, showing a cylinder that represents the robot's end effector position and gripper state.
  • Figure 3: Overview of the evaluation user study and prototype 2 of EVE. Left: The study included three common tabletop tasks: toggling a switch, sorting food, and sweeping the table. Right: An example of data collection for the food sorting task using the final prototype of EVE.
  • Figure 4: The mean task success rate (%) for all interfaces in the evaluation user study, which included three tasks with 10 trials each, indicates that EVE achieved the highest success rate across all tasks.
  • Figure 5: The mean and the standard deviation of the remaining time (seconds) for successfully completing one demonstration for each task. EVE performed comparably to kinesthetic teaching, with an average difference of 5.1 seconds across the three tasks.
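
The path-history behaviour described in the Figure 2 caption (collecting waypoints, reverting to an earlier waypoint, and replaying the result for verification) can be pictured with a small data structure. The following is only an illustrative sketch in Python, not the EVE implementation: the names `Waypoint`, `PathHistory`, `add`, `revert_to`, and `replay`, as well as the coordinate values and the choice of metres as the unit, are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Waypoint:
    """One recorded end-effector target (hypothetical representation)."""
    position: Tuple[float, float, float]  # x, y, z; units assumed to be metres
    gripper_closed: bool = False


@dataclass
class PathHistory:
    """Ordered list of waypoints with revert and replay (illustrative only)."""
    waypoints: List[Waypoint] = field(default_factory=list)

    def add(self, position: Tuple[float, float, float],
            gripper_closed: bool = False) -> int:
        """Append a waypoint and return its index in the history."""
        self.waypoints.append(Waypoint(tuple(position), gripper_closed))
        return len(self.waypoints) - 1

    def revert_to(self, index: int) -> None:
        """Discard every waypoint recorded after the one at `index`."""
        if not 0 <= index < len(self.waypoints):
            raise IndexError(f"no waypoint at index {index}")
        del self.waypoints[index + 1:]

    def replay(self) -> Iterator[Waypoint]:
        """Yield waypoints in recorded order, e.g. to drive a preview."""
        yield from self.waypoints


# Made-up coordinates, purely for illustration.
history = PathHistory()
history.add((0.30, 0.00, 0.20))
history.add((0.30, 0.10, 0.05), gripper_closed=True)
history.add((0.10, 0.40, 0.30), gripper_closed=True)  # a mistaken waypoint
history.revert_to(1)                                  # drop the mistake
history.add((0.30, 0.25, 0.20), gripper_closed=True)  # record a corrected one
replayed = list(history.replay())
```

Storing the history as a plain ordered list keeps reverting to a single truncation, and replay is then just an in-order traversal; a real system would additionally need to check joint constraints and generate motion between waypoints, which this sketch leaves out.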