LocoMuJoCo: A Comprehensive Imitation Learning Benchmark for Locomotion

Firas Al-Hafez, Guoping Zhao, Jan Peters, Davide Tateo

TL;DR

The paper addresses the absence of standardized, realistic benchmarks for imitation learning (IL) in locomotion. It introduces LocoMuJoCo, a benchmark with 12 environments and 27 tasks across humanoid, quadruped, and musculoskeletal models. The environments are paired with diverse datasets, including real motion capture data and expert and sub-optimal demonstrations, and the benchmark supports dynamics randomization and partial observability. It provides a training and evaluation pipeline with Gymnasium and Mushroom-RL integrations, along with baseline algorithms to enable rigorous comparisons. The work aims to accelerate progress in locomotion IL through reproducible evaluation, easier sim-to-real transfer, and support for multiple data paradigms and embodiment scenarios.

Abstract

Imitation Learning (IL) holds great promise for enabling agile locomotion in embodied agents. However, many existing locomotion benchmarks primarily focus on simplified toy tasks, often failing to capture the complexity of real-world scenarios and steering research toward unrealistic domains. To advance research in IL for locomotion, we present a novel benchmark designed to facilitate rigorous evaluation and comparison of IL algorithms. This benchmark encompasses a diverse set of environments, including quadrupeds, bipeds, and musculoskeletal human models, each accompanied by comprehensive datasets, such as real noisy motion capture data, ground truth expert data, and ground truth sub-optimal data, enabling evaluation across a spectrum of difficulty levels. To increase the robustness of learned agents, we provide an easy interface for dynamics randomization and offer a wide range of partially observable tasks to train agents across different embodiments. Finally, we provide handcrafted metrics for each task and ship our benchmark with state-of-the-art baseline algorithms to ease evaluation and enable fast benchmarking.

Paper Structure (6 sections, 2 figures)

This paper contains 6 sections and 2 figures.

Figures (2)

  • Figure 1: Overview of environments. Each task is defined by a certain dataset in an environment, e.g., the Talos carry or the muscle humanoid running task. Currently, LocoMuJoCo encompasses 12 environments with a total of 27 tasks.
  • Figure 2: Training pipeline of LocoMuJoCo. First, an environment is chosen. Then, a task and dataset are chosen, and the training is started optionally with dynamics randomization. Finally, the performance of the algorithm can be compared to the expert performance or the performance of one of the provided baseline algorithms.
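
The pipeline in Figure 2 (choose an environment, choose a task and dataset, optionally randomize the dynamics, then compare against the expert) can be illustrated with a small, library-independent sketch. Everything below is hypothetical and written only for illustration: `TaskSpec`, `randomize_dynamics`, `relative_performance`, the dataset label `"mocap"`, the parameter names, and the numeric returns are assumptions, not LocoMuJoCo's actual API, identifiers, or results.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical task description: a dataset chosen within an environment."""
    environment: str  # e.g. "Talos"
    task: str         # e.g. "carry"
    dataset: str      # placeholder label, e.g. "mocap", "expert", "sub-optimal"

    @property
    def name(self) -> str:
        # Join the three choices into one readable identifier.
        return f"{self.environment}.{self.task}.{self.dataset}"


def randomize_dynamics(nominal, rel_range, rng):
    """Scale each nominal parameter by a factor drawn uniformly
    from [1 - rel_range, 1 + rel_range]."""
    return {
        key: value * rng.uniform(1.0 - rel_range, 1.0 + rel_range)
        for key, value in nominal.items()
    }


def relative_performance(agent_return, expert_return):
    """Agent return as a fraction of the expert return (1.0 = expert level)."""
    if expert_return == 0:
        raise ValueError("expert_return must be non-zero")
    return agent_return / expert_return


# Steps 1 and 2: pick an environment, then a task and dataset.
spec = TaskSpec(environment="Talos", task="carry", dataset="mocap")

# Step 3 (optional): perturb made-up nominal dynamics parameters by up to 20%.
rng = random.Random(0)
params = randomize_dynamics({"link_mass": 2.0, "joint_damping": 0.1}, 0.2, rng)

# Step 4: compare an illustrative agent return with an illustrative expert return.
score = relative_performance(agent_return=850.0, expert_return=1000.0)

print(spec.name, params, score)
```

Uniform scaling around nominal values is only one simple way to randomize dynamics, and a return ratio is only one possible comparison; the benchmark's own randomization interface and handcrafted per-task metrics may differ.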