RoboCrowd: Scaling Robot Data Collection through Crowdsourcing

Suvir Mirchandani, David D. Yuan, Kaylee Burns, Md Sazzad Islam, Tony Z. Zhao, Chelsea Finn, Dorsa Sadigh

TL;DR

This paper tackles the high data demands of imitation learning for robot policies by introducing RoboCrowd, a crowdsourcing framework that leverages three incentive classes—material rewards, intrinsic interest, and social comparison—to collect in-person robot demonstrations. Built on the ALOHA bimanual teleoperation platform, RoboCrowd emphasizes public accessibility, safety, intuitiveness, and gamification, and is validated through a two-week field deployment that gathered 817 episodes from 231 users. The authors show that crowdsourced data can serve as effective pre-training data for policies fine-tuned on expert demonstrations, achieving up to 20% improvements in policy performance and providing diverse behaviors that improve downstream transfer. These results suggest RoboCrowd can substantially reduce the burden of robot data collection while enabling scalable, diverse, and high-quality demonstrations in real-world settings.

Abstract

In recent years, imitation learning from large-scale human demonstrations has emerged as a promising paradigm for training robot policies. However, the burden of collecting large quantities of human demonstrations is significant in terms of collection time and the need for access to expert operators. We introduce a new data collection paradigm, RoboCrowd, which distributes the workload by utilizing crowdsourcing principles and incentive design. RoboCrowd helps enable scalable data collection and facilitates more efficient learning of robot policies. We build RoboCrowd on top of ALOHA (Zhao et al. 2023) -- a bimanual platform that supports data collection via puppeteering -- to explore the design space for crowdsourcing in-person demonstrations in a public environment. We propose three classes of incentive mechanisms to appeal to users' varying sources of motivation for interacting with the system: material rewards, intrinsic interest, and social comparison. We instantiate these incentives through tasks that include physical rewards, engaging or challenging manipulations, as well as gamification elements such as a leaderboard. We conduct a large-scale, two-week field experiment in which the platform is situated in a university cafe. We observe significant engagement with the system -- over 200 individuals independently volunteered to provide a total of over 800 interaction episodes. Our findings validate the proposed incentives as mechanisms for shaping users' data quantity and quality. Further, we demonstrate that the crowdsourced data can serve as useful pre-training data for policies fine-tuned on expert demonstrations -- boosting performance up to 20% compared to when this data is not available. These results suggest the potential for RoboCrowd to reduce the burden of robot data collection by carefully implementing crowdsourcing and incentive design principles.

Paper Structure

This paper contains 42 sections, 15 figures, 10 tables.

Figures (15)

  • Figure 1: Example of incentivizing demonstrations in RoboCrowd. The principal $P$ consists of a robot teleoperation setup, a designer, and a scene they have designed. The scene contains tasks that an agent $I$ (a crowd user) can attempt, guided by incentives put in place by the designer. For example, a material reward---e.g., a candy in a bin---can motivate $I$ to produce a successful trajectory for a bin-picking task, which the designer can add to a dataset.
  • Figure 2: System Overview. (Left) RoboCrowd uses the ALOHA robot (Zhao et al. 2023), a bimanual teleoperation platform wherein users control 2 ViperX follower arms by puppeteering via 2 WidowX leader arms. Users can perform tasks in scenes put in place by the scene designer; tasks may include physical rewards that the user can bring to the End Zone and access via the Handover Region. (Right) Users are guided by a GUI on a tablet. Functionalities include an Interactive Tutorial to get acquainted with RoboCrowd, a Task Page to select among tasks, and a Leaderboard where users can compare their scores. For additional details, please see the software appendix.
  • Figure 3: Scene Setup. Illustration of BinScene, Bin+DispenserScene, and Bin+ZiplocScene, and the objects relevant to our 6 tasks (hi-chew, tootsie-roll, hershey-kiss, jelly-bean, hi-chew-bin, hi-chew-ziploc).
  • Figure 4: Dataset composition by number of time steps for each of our three scenes. Different hues indicate different tasks. Tasks receive quality scores from 1 to 3 (higher is better) which are also indicated by brighter shades. Tutorial data receives a score of 1 or 2. Play data always receives a score of 0.
  • Figure 5: Quantity and quality by leaderboard use. Violin plot showing the distribution of quantity and quality of demonstrations for users who did and did not visit the leaderboard.
  • ...and 10 more figures