Quality over Quantity: Demonstration Curation via Influence Functions for Data-Centric Robot Learning

Haeone Lee, Taywon Min, Junsu Kim, Sinjae Kang, Fangchen Liu, Lerrel Pinto, Kimin Lee

TL;DR

Quality over Quantity (QoQ) is a grounded, systematic approach to identifying high-quality demonstration data. It defines data quality as each training sample's contribution to reducing loss on validation demonstrations, and it estimates that contribution with influence functions, which quantify the impact of individual training samples on model performance.

Abstract

Learning from demonstrations has emerged as a promising paradigm for end-to-end robot control, particularly when scaled to large and diverse datasets. However, the quality of demonstration data, often collected through human teleoperation, remains a critical bottleneck for effective data-driven robot learning. Human errors, operational constraints, and teleoperator variability introduce noise and suboptimal behaviors, making data curation essential yet largely manual and heuristic-driven. In this work, we propose Quality over Quantity (QoQ), a grounded and systematic approach to identifying high-quality data by defining data quality as the contribution of each training sample to reducing loss on validation demonstrations. To efficiently estimate this contribution, we leverage influence functions, which quantify the impact of individual training samples on model performance. We further introduce two key techniques to adapt influence functions for robot demonstrations: (i) using maximum influence across validation samples to capture the most relevant state-action pairs, and (ii) aggregating influence scores of state-action pairs within the same trajectory to reduce noise and improve data coverage. Experiments in both simulated and real-world settings show that QoQ consistently improves policy performance over prior data selection methods.
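The two adaptations named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the inverse Hessian in the influence function is replaced by the identity (so influence reduces to a gradient dot product), it assumes the trajectory-level aggregation is a mean, and all function names and the toy gradient vectors are hypothetical. In practice the gradients would come from the policy's imitation loss on each state-action pair.

```python
from collections import defaultdict


def dot(u, v):
    """Dot product of two equal-length gradient vectors."""
    return sum(a * b for a, b in zip(u, v))


def sample_scores(train_grads, val_grads):
    """Technique (i): score each training pair by its MAXIMUM influence
    over all validation pairs.

    Assumption: influence is approximated by gradient alignment
    (identity in place of the inverse Hessian).
    """
    return [max(dot(g, v) for v in val_grads) for g in train_grads]


def trajectory_scores(per_sample, traj_ids):
    """Technique (ii): aggregate per-sample scores within each trajectory.

    Assumption: the aggregate is a mean; the source does not state
    which aggregation is used.
    """
    buckets = defaultdict(list)
    for score, traj in zip(per_sample, traj_ids):
        buckets[traj].append(score)
    return {traj: sum(vals) / len(vals) for traj, vals in buckets.items()}


def select_top_trajectories(scores, k):
    """Keep the k trajectories with the highest aggregated score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: six state-action pairs from three trajectories (hypothetical).
train_grads = [
    [1.0, 0.0], [0.8, 0.2],      # trajectory 0: aligned with validation
    [-1.0, 0.0], [-0.5, -0.5],   # trajectory 1: opposed to validation
    [0.0, 1.0], [0.1, 0.9],      # trajectory 2: aligned with validation
]
traj_ids = [0, 0, 1, 1, 2, 2]
val_grads = [[1.0, 0.0], [0.0, 1.0]]

per_sample = sample_scores(train_grads, val_grads)
scores = trajectory_scores(per_sample, traj_ids)
selected = select_top_trajectories(scores, k=2)
print(selected)
```

In this toy setup, trajectory 1 is dropped because its gradients point away from every validation gradient, while trajectories 0 and 2 are kept even though each one matches only a single validation sample. That is the effect the maximum is meant to have: a training pair needs to be relevant to some validation pair, not to all of them.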

Paper Structure (18 sections, 8 equations, 6 figures, 4 tables)


Figures (6)

  • Figure 1: Illustration of high-quality data. Our robot data curation method, QoQ, selects trajectories based on their direct contribution to policy performance, using influence functions (Koh and Liang, 2017) to quantify this impact. Specifically, we measure the similarity between gradients of validation data (blue) and those of training data; higher similarity (green) indicates that including a particular state-action pair will effectively reduce validation loss. By prioritizing these high-impact data points, we systematically identify and preserve the most valuable training data that drive performance improvement.
  • Figure 2: Success rate for simulation and real robot experiments. QoQ outperforms all baselines in simulation and real robot experiments by detecting helpful trajectories. We report the mean and standard deviation across 5 runs (simulation) and 3 runs (real robot).
  • Figure 3: Visualization of experiment environments across simulation and real robot setups, including single-task banana grasping, multi-object pick-and-place, and an open-cabinet task.
  • Figure 4: DROID dataset curation accuracy (%). Compared to baselines, QoQ maintains high curation accuracy on the DROID dataset, which spans different domains and object locations.
  • Figure 5: Open cabinet policy success rate.
  • ...and 1 more figure
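
The gradient-similarity description in the Figure 1 caption can be related to the standard influence-function form from Koh and Liang (2017). The notation below is assumed for illustration and is not taken from the paper: $\ell$ is the per-sample loss, $\hat\theta$ the trained parameters, $z$ a training state-action pair, and $z_{\text{val}}$ a validation pair.

```latex
\mathcal{I}(z, z_{\text{val}})
  = -\,\nabla_\theta \ell(z_{\text{val}}, \hat\theta)^{\top}
     H_{\hat\theta}^{-1}\,
     \nabla_\theta \ell(z, \hat\theta),
\qquad
H_{\hat\theta} = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta^{2}\, \ell(z_i, \hat\theta)
```

Under this sign convention, a negative $\mathcal{I}(z, z_{\text{val}})$ means that upweighting $z$ is expected to lower the loss on $z_{\text{val}}$. If $H_{\hat\theta}$ is approximated by the identity, $-\mathcal{I}$ becomes the inner product of the two gradients, which is one way to read the "similarity between gradients" in the caption.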