GNM: A General Navigation Model to Drive Any Robot

Dhruv Shah, Ajay Sridhar, Arjun Bhorkar, Noriaki Hirose, Sergey Levine

TL;DR

GNM demonstrates that a single, goal-conditioned navigation model trained on heterogeneous data from multiple robots can generalize to unseen platforms and environments. By standardizing a shared action space and conditioning on an embodiment context derived from past observations, the omnipolicy achieves robust zero-shot deployment and outperforms single-domain policies. The work highlights the value of cross-robot data aggregation and systematic design-space analysis, suggesting a path toward universal pre-trained navigation backbones for robotics. The approach is validated with four unseen robots and shows resilience to sensor and actuation degradation, marking a step toward scalable, transferable vision-based navigation.

Abstract

Learning provides a powerful tool for vision-based navigation, but the capabilities of learning-based policies are constrained by limited training data. If we could combine data from all available sources, including multiple kinds of robots, we could train more powerful navigation models. In this paper, we study how a general goal-conditioned model for vision-based navigation can be trained on data obtained from many distinct but structurally similar robots, and enable broad generalization across environments and embodiments. We analyze the necessary design decisions for effective data sharing across robots, including the use of temporal context and standardized action spaces, and demonstrate that an omnipolicy trained from heterogeneous datasets outperforms policies trained on any single dataset. We curate 60 hours of navigation trajectories from 6 distinct robots, and deploy the trained GNM on a range of new robots, including an underactuated quadrotor. We find that training on diverse data leads to robustness against degradation in sensing and actuation. Using a pre-trained navigation model with broad generalization capabilities can bootstrap applications on novel robots going forward, and we hope that the GNM represents a step in that direction. For more information on the datasets, code, and videos, please check out our project page https://sites.google.com/view/drive-any-robot.
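
The abstract names a standardized action space as one of the design decisions that make data sharing across robots possible. The sketch below illustrates one plausible reading of that idea, not the authors' implementation: waypoints are divided by the distance a robot covers in one step at top speed, so a slow and a fast robot driving at full speed yield the same normalized targets. The names `RobotSpec`, `normalize_waypoints`, and `denormalize_waypoints`, the speed values, and the choice of top speed as the scaling factor are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Waypoint = Tuple[float, float]  # (forward, lateral) offset in metres


@dataclass(frozen=True)
class RobotSpec:
    """Hypothetical per-robot constants; values are illustrative only."""
    name: str
    max_speed_mps: float  # assumed top linear speed of the robot
    step_s: float         # assumed time between consecutive waypoints


def _scale(spec: RobotSpec) -> float:
    # Distance covered in one step at top speed (assumed scaling factor).
    return spec.max_speed_mps * spec.step_s


def normalize_waypoints(waypoints: List[Waypoint], spec: RobotSpec) -> List[Waypoint]:
    """Map metric waypoints into a robot-agnostic, unitless space."""
    s = _scale(spec)
    return [(x / s, y / s) for x, y in waypoints]


def denormalize_waypoints(waypoints: List[Waypoint], spec: RobotSpec) -> List[Waypoint]:
    """Map shared-space predictions back to metres for a specific robot."""
    s = _scale(spec)
    return [(x * s, y * s) for x, y in waypoints]


# Two hypothetical robots with very different top speeds.
small = RobotSpec(name="small_rover", max_speed_mps=0.5, step_s=1.0)
large = RobotSpec(name="large_atv", max_speed_mps=5.0, step_s=1.0)

# Each drives straight ahead at its own top speed for two steps.
small_norm = normalize_waypoints([(0.5, 0.0), (1.0, 0.0)], small)
large_norm = normalize_waypoints([(5.0, 0.0), (10.0, 0.0)], large)
```

Under these assumptions both trajectories map to the same normalized targets, so a single model could be trained on either dataset; at deployment, `denormalize_waypoints` would convert a prediction back to metres for whichever robot is being driven.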

Paper Structure (16 sections, 6 figures, 5 tables)

Figures (6)

  • Figure 1: A general navigation model to drive any robot. By training on diverse, heterogeneous datasets, a single "omnipolicy" can control a variety of robots in challenging environments, including new robots, without any robot-specific data collection.
  • Figure 2: GNM architecture. We modify a typical goal-conditioned architecture (purple) by conditioning it on additional context from the target robot (pink) and making predictions in a shared, normalized action space (yellow).
  • Figure 3: Deploying the GNM omnipolicy. We evaluate on 4 different robots in challenging indoor and outdoor environments.
  • Figure 4: Qualitative comparison. Policies trained with increasingly diverse data lead to better generalization to a LoCoBot (top) and Jackal (bottom). Both robots were controlled by the same policy.
  • Figure 5: Policies trained with GNM are more robust to degradation in parameters such as (a) actuation, (b) perturbed sensor viewpoint, and (c) physical damage, than single-domain policies (d).
  • ...and 1 more figure