Efficient Data Collection for Robotic Manipulation via Compositional Generalization

Jensen Gao, Annie Xie, Ted Xiao, Chelsea Finn, Dorsa Sadigh

TL;DR

The paper tackles enabling broad generalization in robotic manipulation under varied environments with limited data. It hypothesizes that end-to-end imitation policies can exhibit compositional generalization and proposes data collection strategies that exploit this property to reduce data requirements. Through extensive simulation (Factor World) and real-robot experiments, it shows that Stair, L, and Diagonal strategies enable substantial composition, particularly when prior data (BridgeData V2) is available. On a real robot, composition-enabled data collection plus prior data achieves high transfer to unseen environments (up to 77.5% success) compared to baselines (as low as 2.5%), highlighting practical gains for scalable robotics data collection.

Abstract

Data collection has become an increasingly important problem in robotic manipulation, yet there is still little understanding of how to collect data effectively to facilitate broad generalization. Recent work on large-scale robotic data collection typically varies many environmental factors of variation (e.g., object types, table textures) during data collection, to cover a diverse range of scenarios. However, this work does not explicitly account for the possible compositional abilities of policies trained on the data. If robot policies can compose environmental factors from their data to succeed when encountering unseen factor combinations, we can exploit this to avoid collecting data for situations that composition would address. To investigate this possibility, we conduct thorough empirical studies both in simulation and on a real robot that compare data collection strategies and assess whether visual imitation learning policies can compose environmental factors. We find that policies do exhibit composition, although leveraging prior robotic datasets is critical for this on a real robot. We use these insights to propose better in-domain data collection strategies that exploit composition, which can induce better generalization than naive approaches for the same amount of effort during data collection. We further demonstrate that a real robot policy trained on data from such a strategy achieves a success rate of 77.5% when transferred to entirely new environments that encompass unseen combinations of environmental factors, whereas policies trained using data collected without accounting for environmental variation fail to transfer effectively, with a success rate of only 2.5%. We provide videos at http://iliad.stanford.edu/robot-data-comp/.

Paper Structure (36 sections, 1 equation, 17 figures, 7 tables, 3 algorithms)

Figures (17)

  • Figure 1: We investigate if robot policies can compose environmental factors of variation (e.g., object types, table heights) in their in-domain training data (1, 2), and if using prior data can be helpful for enforcing such generalization. We propose efficient in-domain data collection strategies guided by the ability of policies to reason about unseen combinations of factor values (3).
  • Figure 2: Visualization of our data collection strategies with $N = 2$ factors. Each axis consists of possible values for a factor. Each green dot indicates that the strategy captures a specific combination of factor values represented by it, and each pink dot represents a combination that compositional generalization may address. We name our strategies based on the patterns in this visualization.
  • Figure 3: Visualization of the Pick Place task. We show combinations of 2 values each for the factors table texture and object position.
  • Figure 4: Simulation results of data collection strategies for Pick Place. We report results where $\mathcal{F}^N$ consists of each possible factor pair ($N = 2$), average results across all pairs, and results where $\mathcal{F}^N$ consists of all factors ($N = 5$). All points within the same subplot use the same amount of demonstrations. The strategies that exploit composition (Stair, L, Diagonal) generally outperform Random, and often approach Complete. Stair generally performs the best, especially in the $N = 5$ setting. Error bars represent standard error across 5 seeds.
  • Figure 5: (left) Compositional success rate of different strategies. (right) Generalization of strategies with increasing dataset sizes.
  • ...and 12 more figures
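
The coverage patterns described in the Figure 2 caption can be sketched as sets of factor-value combinations. This is a minimal illustration, not the authors' code: the function names, the choice of index 0 as the base value for each factor, and the exact shape of each pattern are assumptions inferred from the strategy names (Stair, L, Diagonal, Complete) and from the caption's description of green (covered) and pink (left to composition) dots. Each factor is assumed to take `k` values, indexed `0..k-1`.

```python
from itertools import product


def complete(n_factors, k):
    """Every combination of factor values (k ** n_factors in total)."""
    return set(product(range(k), repeat=n_factors))


def diagonal(n_factors, k):
    """Assumed pattern: all factors change together, giving k combinations."""
    return {(v,) * n_factors for v in range(k)}


def l_shape(n_factors, k):
    """Assumed pattern: vary one factor at a time from a shared base (all zeros)."""
    combos = {(0,) * n_factors}
    for f in range(n_factors):
        for v in range(1, k):
            combo = [0] * n_factors
            combo[f] = v
            combos.add(tuple(combo))
    return combos


def stair(n_factors, k):
    """Assumed pattern: change one factor per step, keeping earlier changes."""
    current = [0] * n_factors
    combos = {tuple(current)}
    for v in range(1, k):
        for f in range(n_factors):
            current[f] = v
            combos.add(tuple(current))
    return combos


def held_out(covered, n_factors, k):
    """Combinations not collected, i.e. those left to compositional generalization."""
    return complete(n_factors, k) - covered


if __name__ == "__main__":
    n, k = 2, 3  # two factors with three values each, as in an N = 2 grid
    for name, strategy in [("stair", stair), ("l", l_shape), ("diagonal", diagonal)]:
        covered = strategy(n, k)
        print(name, sorted(covered), "held out:", len(held_out(covered, n, k)))
```

Under these assumptions, every strategy sees each individual factor value at least once, but Stair and L collect only `1 + N * (k - 1)` combinations and Diagonal only `k`, compared with `k ** N` for Complete. The remaining combinations are the ones a policy would have to handle by composing factor values it has seen separately.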