AnyTask: an Automated Task and Data Generation Framework for Advancing Sim-to-Real Policy Learning

Ran Gong, Xiaohan Zhang, Jinghuan Shang, Maria Vittoria Minniti, Jigarkumar Patel, Valerio Pepe, Riedana Yan, Ahmet Gundogdu, Ivan Kapelyukh, Ali Abbas, Xiaoqiang Yan, Harsh Patel, Laura Herlant, Karl Schmeckpeper

TL;DR

AnyTask tackles the robot data bottleneck by automating task design and data generation with a pipeline that combines massively parallel GPU simulation and foundation models. It introduces three agents (ViPR, ViPR-Eureka, and ViPR-RL) that generate expert demonstrations using, respectively, task and motion planning with VLM-in-the-loop refinement, reinforcement learning with generated dense rewards, and a hybrid of planning and learning that needs only sparse rewards. The framework combines an object database, automated task and simulation generation, and dense trajectory annotations to produce synthetic data for training policies that are deployed zero-shot on real robots. Empirical results show efficient data generation, diverse task coverage, and 44% average success in real-world evaluation, highlighting both the promise and the current limitations of fully synthetic robotic training pipelines.

Abstract

Generalist robot learning remains constrained by data: large-scale, diverse, and high-quality interaction data are expensive to collect in the real world. While simulation has become a promising way to scale up data collection, the related tasks, including simulation task design, task-aware scene generation, expert demonstration synthesis, and sim-to-real transfer, still demand substantial human effort. We present AnyTask, an automated framework that pairs massively parallel GPU simulation with foundation models to design diverse manipulation tasks and synthesize robot data. We introduce three AnyTask agents for generating expert demonstrations, aiming to solve as many tasks as possible: 1) ViPR, a novel task and motion planning agent with VLM-in-the-loop Parallel Refinement; 2) ViPR-Eureka, a reinforcement learning agent with generated dense rewards and LLM-guided contact sampling; 3) ViPR-RL, a hybrid planning and learning approach that jointly produces high-quality demonstrations with only sparse rewards. We train behavior cloning policies on generated data, validate them in simulation, and deploy them directly on real robot hardware. The policies generalize to novel object poses, achieving 44% average success across a suite of real-world pick-and-place, drawer opening, contact-rich pushing, and long-horizon manipulation tasks. Our project website is at https://anytask.rai-inst.com.

Paper Structure

This paper contains 41 sections, 25 figures, 12 tables.

Figures (25)

  • Figure 1: AnyTask is a framework that automates task design and generates data for robot learning. The resulting data enables training visuomotor policies that can be deployed directly onto a physical robot without requiring any real-world data.
  • Figure 2: Overview of AnyTask. We first generate simulated manipulation tasks from an object database and a high-level task (i.e., task type). The pipeline then automatically proposes task descriptions, generates the simulation code, and efficiently collects data in massively parallel simulation environments using different agents, including ViPR, ViPR-RL, and ViPR-Eureka. We apply online domain randomization in the simulation to ensure the diversity of the scenes and the visual observations. Finally, we train the policy on simulated data and transfer it zero-shot to the real world.
  • Figure 3: ViPR improvement: using ViPR leads to an average $12.8\%$ improvement in success rate across 301 tasks.
  • Figure 4: Action replay enables faster data collection, especially on challenging tasks.
  • Figure 5: Zero-shot sim-to-real policy evaluations.
  • ...and 20 more figures