QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds

Yuting Mei, Ye Wang, Sipeng Zheng, Qin Jin

TL;DR

QuadrupedGPT tackles the challenge of building a quadruped agent that can autonomously navigate open-ended environments with pet-like agility and human-like cognition. It integrates three components: automatic locomotion adaptation via the Location-Simulation-Selection (LSS) strategy, local path planning based on semantic costs, and high-level reasoning driven by a large multimodal model (LMM) that decomposes long-horizon goals into executable subgoals. The approach is validated in simulation and real-world experiments, showing improvements over manual tuning and fixed-gait baselines and demonstrating safer, stance-aware navigation in varied terrains. The work advances practical, general-purpose quadruped autonomy in open-world settings and highlights areas for improvement in perception robustness and transfer to diverse contexts.

Abstract

As robotic agents increasingly assist humans in the real world, quadruped robots offer unique opportunities for interaction in complex scenarios due to their agile movement. However, building agents that can autonomously navigate, adapt, and respond to versatile goals remains a significant challenge. In this work, we introduce QuadrupedGPT, an agent designed to follow diverse commands with agility comparable to that of a pet. The primary challenges addressed include: i) effectively utilizing multimodal observations for informed decision-making; ii) achieving agile control by integrating locomotion and navigation; iii) developing advanced cognition to execute long-term objectives. QuadrupedGPT interprets human commands and environmental contexts using a large multimodal model. Leveraging its extensive knowledge base, the agent autonomously assigns parameters for adaptive locomotion policies and devises safe yet efficient paths toward its goals. Additionally, it employs high-level reasoning to decompose long-term goals into a sequence of executable subgoals. Through comprehensive experiments, our agent shows proficiency in handling diverse tasks and intricate instructions, representing a significant step toward the development of versatile quadruped agents for open-ended environments.
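
To make the idea of planning "safe yet efficient paths" over a semantic cost map more concrete, here is a minimal sketch. It is not the planner used in the paper, whose details are not given in this section. It assumes a 2D grid in which each cell already holds a traversal cost (hypothetical values: 1 for easy ground, 9 for terrain the agent should avoid), and it uses plain Dijkstra search on a 4-connected grid. The names `plan_path` and `DEMO_COST_MAP` are illustrative only.

```python
import heapq

# Hypothetical cost map: 1 = easy ground, 9 = terrain to avoid (illustrative values).
DEMO_COST_MAP = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]


def plan_path(cost_map, start, goal):
    """Dijkstra search on a 4-connected grid.

    The cost of a step is the cost of the cell being entered.
    Returns the list of (row, col) cells from start to goal, or None
    if the goal cannot be reached.
    """
    rows, cols = len(cost_map), len(cost_map[0])
    best = {start: 0.0}          # lowest known cost to reach each cell
    parent = {}                  # back-pointers for path reconstruction
    frontier = [(0.0, start)]    # min-heap of (cost so far, cell)

    while frontier:
        dist, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if dist > best[cell]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            step = cost_map[nr][nc]
            if step == float("inf"):
                continue  # treat infinite cost as an impassable cell
            new_dist = dist + step
            if new_dist < best.get((nr, nc), float("inf")):
                best[(nr, nc)] = new_dist
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (new_dist, (nr, nc)))

    if goal not in best:
        return None

    # Walk the back-pointers from goal to start, then reverse.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    print(plan_path(DEMO_COST_MAP, (0, 0), (0, 2)))
```

In this toy map the direct route crosses a high-cost cell, so the search detours through the bottom row. That is the kind of trade-off between path length and terrain cost that a semantic cost map is meant to encode.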

Paper Structure (26 sections, 2 equations, 10 figures, 6 tables)

Figures (10)

  • Figure 1: Built upon a cutting-edge large multimodal model, QuadrupedGPT aims to develop a versatile quadruped agent with the agility of four-legged pets while being able to comprehend intricate human commands and complete them safely and efficiently in open-world environments.
  • Figure 2: Overview of our automatic locomotion adaptation strategy "Location-Simulation-Selection" (LSS).
  • Figure 3: Example prompt to guide quadruped agents across uneven terrains.
  • Figure 4: Illustration of 2D open-vocabulary cost map generation.
  • Figure 5: Ablation results for different parameter selection alternatives in the LSS strategy. The average $r_{v_{x, y}^{cmd}}$ is reported.
  • ...and 5 more figures