ChatGPT for Robotics: Design Principles and Model Abilities
Sai Vemprala, Rogerio Bonatti, Arthur Bucker, Ashish Kapoor
TL;DR
This work investigates using OpenAI's ChatGPT for robotics by combining prompt-engineering design principles with a high-level function library, so that the model can adapt to diverse tasks, simulators, and form factors. It evaluates prompt engineering techniques and dialog strategies, including free-form dialog, XML tag parsing, code synthesis, task-specific prompting functions, and closed-loop reasoning through dialogue, on tasks ranging from basic logical, geometrical, and mathematical reasoning to aerial navigation, manipulation, and embodied agents. It also introduces PromptCraft, an open-source tool that combines a platform where researchers can share and vote on prompting schemes for robotics with a sample robotics simulator integrated with ChatGPT. The results indicate that ChatGPT can solve several of these tasks while letting users interact with it primarily through natural language instructions.
Abstract
This paper presents an experimental study of the use of OpenAI's ChatGPT for robotics applications. We outline a strategy that combines design principles for prompt engineering with the creation of a high-level function library, which allows ChatGPT to adapt to different robotics tasks, simulators, and form factors. We focus our evaluations on the effectiveness of different prompt engineering techniques and dialog strategies for executing various types of robotics tasks. We explore ChatGPT's ability to use free-form dialog, parse XML tags, and synthesize code, in addition to the use of task-specific prompting functions and closed-loop reasoning through dialogues. Our study encompasses a range of tasks within the robotics domain, from basic logical, geometrical, and mathematical reasoning all the way to complex domains such as aerial navigation, manipulation, and embodied agents. We show that ChatGPT can be effective at solving several such tasks, while allowing users to interact with it primarily via natural language instructions. In addition to these studies, we introduce an open-source research tool called PromptCraft, which contains a platform where researchers can collaboratively upload and vote on examples of good prompting schemes for robotics applications, as well as a sample robotics simulator with ChatGPT integration, making it easier for users to get started with ChatGPT for robotics.
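To make the strategy in the abstract concrete, the following is a minimal Python sketch of how a prompt might expose a high-level function library and how XML-tagged replies might be parsed and checked. It is an illustration under stated assumptions, not the paper's implementation: the function names (takeoff, fly_to, get_position, land), the prompt wording, and the <code> and <question> tag names are all hypothetical, and no call to ChatGPT is made.

```python
import re

# Hypothetical high-level function library (names are illustrative only):
# each entry maps a function signature to a plain-language description.
FUNCTION_LIBRARY = {
    "takeoff()": "take off and hover",
    "fly_to(x, y, z)": "fly to the given position in meters",
    "get_position(object_name)": "return the (x, y, z) position of a named object",
    "land()": "land at the current position",
}


def build_prompt(task, library):
    """Assemble a prompt that lists the allowed functions and states the task."""
    lines = ["You are controlling a robot. Use only the functions listed below.", ""]
    for signature, description in library.items():
        lines.append(f"- {signature}: {description}")
    lines += [
        "",
        "Wrap code in <code></code> tags and questions in <question></question> tags.",
        "",
        f"Task: {task}",
    ]
    return "\n".join(lines)


def extract_tag(response, tag):
    """Return the stripped contents of every <tag>...</tag> span in a reply."""
    pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)
    return [match.strip() for match in pattern.findall(response)]


def called_functions(code):
    """Collect the names that appear directly before '(' in a code string."""
    return set(re.findall(r"\b([A-Za-z_]\w*)\s*\(", code))


def allowed_names(library):
    """Reduce each signature in the library to its bare function name."""
    return {signature.split("(")[0] for signature in library}


if __name__ == "__main__":
    print(build_prompt("Fly to the chair and land.", FUNCTION_LIBRARY))

    # A canned reply standing in for a model response (assumption, not real output).
    reply = "<code>\ntakeoff()\nx, y, z = get_position('chair')\nfly_to(x, y, z)\nland()\n</code>"
    for block in extract_tag(reply, "code"):
        unknown = called_functions(block) - allowed_names(FUNCTION_LIBRARY)
        print("unknown functions:", sorted(unknown))
```

Checking the generated code against the library before execution is one possible safeguard. A reply that contains a <question> tag instead of code could be routed back to the user, which is one way to realize the closed-loop dialogue the abstract mentions.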
