AnyCar to Anywhere: Learning Universal Dynamics Model for Agile and Adaptive Mobility

Wenli Xiao, Haoru Xue, Tony Tao, Dvij Kalaria, John M. Dolan, Guanya Shi

TL;DR

AnyCar presents a transformer-based universal dynamics model that generalizes across diverse wheeled robot configurations while achieving agile control via in-context adaptation. It combines massive simulated data from multiple physics backends with a robust training regime and a brief real-world fine-tuning phase, then deploys with a sampling-based MPPI controller to achieve 50 Hz control. Empirical results show strong few-shot and zero-shot generalization, with up to 54% performance gains over specialist baselines and demonstrated resilience to state-estimation errors in indoor and outdoor environments. This work advances toward a foundation model for agile wheeled robot control and provides an open-source framework to facilitate further research.

Abstract

Recent works in the robot learning community have successfully introduced generalist models capable of controlling various robot embodiments across a wide range of tasks, such as navigation and locomotion. However, achieving agile control, which pushes the limits of robotic performance, still relies on specialist models that require extensive parameter tuning. To leverage generalist-model adaptability and flexibility while achieving specialist-level agility, we propose AnyCar, a transformer-based generalist dynamics model designed for agile control of various wheeled robots. To collect training data, we unify multiple simulators and leverage different physics backends to simulate vehicles with diverse sizes, scales, and physical properties across various terrains. With robust training and real-world fine-tuning, our model enables precise adaptation to different vehicles, even in the wild and under large state estimation errors. In real-world experiments, AnyCar shows both few-shot and zero-shot generalization across a wide range of vehicles and environments, where our model, combined with a sampling-based MPC, outperforms specialist models by up to 54%. These results represent a key step toward building a foundation model for agile wheeled robot control. We will also open-source our framework to support further research.
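The pairing of a dynamics model with a sampling-based MPPI controller, as described in the TL;DR, can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_dynamics` is a hypothetical kinematic unicycle standing in for the learned transformer model, the goal-distance cost, noise scales, and temperature are assumed values, and `dt=0.02` is chosen only to match the stated 50 Hz control rate.

```python
import math
import random


def toy_dynamics(state, action, dt=0.02):
    """Hypothetical stand-in for a learned dynamics model (kinematic unicycle).

    dt=0.02 s corresponds to a 50 Hz control rate.
    """
    x, y, yaw = state
    speed, yaw_rate = action
    return (
        x + speed * math.cos(yaw) * dt,
        y + speed * math.sin(yaw) * dt,
        yaw + yaw_rate * dt,
    )


def rollout_cost(state, actions, goal, dynamics):
    """Sum of squared distances to the goal along one rollout (assumed cost)."""
    cost = 0.0
    for action in actions:
        state = dynamics(state, action)
        cost += (state[0] - goal[0]) ** 2 + (state[1] - goal[1]) ** 2
    return cost


def softmin_weights(costs, temperature):
    """Normalized exponential weights; a lower cost gets a higher weight."""
    best = min(costs)  # subtract the minimum for numerical stability
    raw = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(raw)
    return [r / total for r in raw]


def mppi_step(state, nominal, goal, dynamics, n_samples=256,
              sigma=(1.0, 0.5), temperature=1.0, rng=None):
    """One MPPI update: perturb the nominal plan, score rollouts, re-average."""
    rng = rng or random.Random(0)
    candidates = []
    for _ in range(n_samples):
        # Add Gaussian noise to every control of every step in the plan.
        candidates.append([
            tuple(u + rng.gauss(0.0, s) for u, s in zip(action, sigma))
            for action in nominal
        ])
    costs = [rollout_cost(state, c, goal, dynamics) for c in candidates]
    weights = softmin_weights(costs, temperature)
    n_controls = len(sigma)
    # The new plan is the cost-weighted average of the sampled plans.
    return [
        tuple(sum(w * c[t][k] for w, c in zip(weights, candidates))
              for k in range(n_controls))
        for t in range(len(nominal))
    ]


if __name__ == "__main__":
    plan = [(0.0, 0.0)] * 25  # 25 steps of (speed, yaw_rate)
    plan = mppi_step((0.0, 0.0, 0.0), plan, goal=(1.0, 0.0),
                     dynamics=toy_dynamics)
    print("first action (speed, yaw_rate):", plan[0])
```

In a receding-horizon loop, only the first action of the returned plan would be executed before re-planning from the next state estimate. Swapping `toy_dynamics` for a model conditioned on recent state-action history is where in-context adaptation would enter; that interface is not shown here.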

Paper Structure

This paper contains 22 sections, 3 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Performance of AnyCar and baselines in the wild under state estimation errors. Above: A 10 cm tolerance corridor is set as a checkpoint. Below: each row represents the true trajectory of one method, and each column corresponds to a specific setting for the 1/16 scale car: high speed (2 m/s), towing a box, and replacing the front left tire with a plastic wheel. All settings significantly alter the vehicle dynamics.
  • Figure 2: AnyCar System Pipeline. Phase 1: We collect 100M data points in 4 different simulations for pre-training and 0.02M few-shot real-world data points for fine-tuning the model. Phase 2: We pre-train the model on the simulation dataset and enhance prediction robustness through masking, adding noise, and attacking the inputs, then fine-tune on the real-world dataset. Phase 3: We deploy AnyCar in the wild under state estimation error (using SLAM [macenski_slam_2021, macenski_marathon_2020, fox_kld-sampling_2001] and VIO [stereolabs_stereolabszed-ros2-wrapper_2024]) to control different vehicles (1/10 scale, 1/16 scale) with different settings (towed object, 3D-printed wheels) on different terrains.
  • Figure 3: Comparison of different model structures and data scales. The reported testing error is normalized using the mean and standard deviation of the evaluation dataset.
  • Figure 4: Visualization of AnyCar's transformer attention in three real-world settings: (a) low speed at 0.5 m/s, (b) high speed at 2 m/s, and (c) towing an object at 2 m/s, all tracking the same reference trajectory. AnyCar's transformer consistently focuses on the nearest 50 steps across all settings and adaptively attends to different sections of the track. For example, it attends to the first corner in setting (b) and the second corner in setting (c).
  • Figure 5: Comparison with baselines in the wild.
  • ...and 1 more figure
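
The Figure 2 caption mentions making predictions robust by masking and adding noise to the model inputs. A minimal sketch of that kind of input corruption is shown below, assuming the input is a history of per-timestep feature vectors. The function name `corrupt_history`, the zero-fill masking scheme, and the default probabilities are illustrative assumptions, not details taken from the paper; the adversarial "attacking" step is not covered.

```python
import random


def corrupt_history(history, mask_prob=0.1, noise_std=0.05, rng=None):
    """Return a corrupted copy of a history of feature vectors.

    Each timestep is either masked (replaced by zeros) with probability
    mask_prob, or perturbed with zero-mean Gaussian noise of std noise_std.
    The input list is left unmodified.
    """
    rng = rng or random.Random()
    corrupted = []
    for step in history:
        if rng.random() < mask_prob:
            # Masked timestep: the model must predict without this observation.
            corrupted.append([0.0] * len(step))
        else:
            # Noisy timestep: mimics state estimation error.
            corrupted.append([v + rng.gauss(0.0, noise_std) for v in step])
    return corrupted


if __name__ == "__main__":
    # Hypothetical history: 4 timesteps of (x, y, yaw) values.
    history = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.01], [0.2, 0.01, 0.02], [0.3, 0.02, 0.03]]
    for row in corrupt_history(history, rng=random.Random(0)):
        print(row)
```

Applying the corruption only to the model inputs, while keeping the prediction targets clean, is one common way to train a predictor that tolerates noisy or missing observations at deployment time.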