All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents

Zhiqiang Wang, Hao Zheng, Yunshuang Nie, Wenjun Xu, Qingwei Wang, Hua Ye, Zhe Li, Kaidong Zhang, Xuewen Cheng, Wanxi Dong, Chang Cai, Liang Lin, Feng Zheng, Xiaodan Liang

TL;DR

The paper addresses the fragmentation and modality gaps in embodied AI datasets by introducing ARIO, a standardized, timestamp-based data framework that unifies real, simulated, and transformed data across diverse robot morphologies. Building on this standard, the authors assemble a large-scale ARIO dataset (~3 million episodes, 258 series, 321,064 tasks) from real-world collection, multiple simulators, and open-source conversions, encompassing five modalities (images, 3D, sound, text, tactile) and a scene-series-task-episode structure. The key contributions are the ARIO standard, a comprehensive multi-modal dataset, and extensive statistics demonstrating broad coverage of scenes, skills, and robot configurations, which together support cross-embodiment learning and sim-to-real research. The work lays the groundwork for scalable, generalizable embodied AI and invites further exploration of large-scale model training, richer modalities, and deeper sim-to-real alignment.

Abstract

Embodied AI is transforming how AI systems interact with the physical world, yet existing datasets are inadequate for developing versatile, general-purpose agents. Their limitations include a lack of standardized formats, insufficient data diversity, and inadequate data volume. To address these issues, we introduce ARIO (All Robots In One), a new data standard that enhances existing datasets by offering a unified data format, comprehensive sensory modalities, and a combination of real-world and simulated data. ARIO aims to improve the training of embodied AI agents, increasing their robustness and adaptability across various tasks and environments. Building upon the proposed new standard, we present a large-scale unified ARIO dataset, comprising approximately 3 million episodes collected from 258 series and 321,064 tasks. The ARIO standard and dataset represent a significant step towards bridging the gaps in existing data resources. By providing a cohesive framework for data collection and representation, ARIO paves the way for the development of more powerful and versatile embodied AI agents, capable of navigating and interacting with the physical world in increasingly complex and diverse ways. The project is available at https://imaei.github.io/project_pages/ario/

Paper Structure (11 sections, 8 figures, 2 tables)

This paper contains 11 sections, 8 figures, 2 tables.

Figures (8)

  • Figure 1: All robots in one.
  • Figure 2: Challenges in embodied intelligence datasets: (a) Insufficient sensory modalities, lacking comprehensive data across multiple input types; (b) Lack of standardization, complicating data processing across diverse robotic forms; (c) Incompatibility across platforms, hindering unified control of various robot types; (d) The gap between simulation and reality, highlighting the need for integrated datasets bridging simulated and real-world data.
  • Figure 3: Collection pipeline of ARIO.
  • Figure 4: Illustration of the data collection platform, which supports bimanual and whole-body teleoperation.
  • Figure 5: Some example tasks; the top row indicates the task category, and the text in the bottom row provides the task instructions.
  • ...and 3 more figures