Exploring GPT-4 for Robotic Agent Strategy with Real-Time State Feedback and a Reactive Behaviour Framework

Thomas O'Brien, Ysobel Sims

TL;DR

The paper addresses how a humanoid robot can interpret high-level user goals via an LLM and execute them through a reactive, tree-based behaviour framework. It integrates GPT-4 as an LLM Provider within the Director framework and NUClear real-time messaging to produce rolling task plans with safety guarantees. Experiments in Webots simulation and on a NUgus platform show that the approach achieves a majority of goals with smooth transitions, while highlighting issues in perception feedback and the cost of online LLM deployment. The work suggests that combining reactive behaviour trees with LLMs offers a practical path toward adaptable, safety-conscious robotic agents, with future work focusing on broader tasks, improved localization, and local LLM deployment.

Abstract

We explore the use of GPT-4 on a humanoid robot in simulation and the real world as proof of concept of a novel large language model (LLM) driven behaviour method. LLMs have shown the ability to perform various tasks, including robotic agent behaviour. The problem involves prompting the LLM with a goal, and the LLM outputs the sub-tasks to complete to achieve that goal. Previous works focus on the executability and correctness of the LLM's generated tasks. We propose a method that successfully addresses practical concerns around safety, transitions between tasks, time horizons of tasks and state feedback. In our experiments we have found that our approach produces output for feasible requests that can be executed every time, with smooth transitions. User requests are achieved most of the time across a range of goal time horizons.

Paper Structure

This paper contains 15 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Example of a Director tree for walking to and kicking a soccer ball [director2023].
  • Figure 2: Overview of the method, showing the flow of data and control between the user request, the system and the robot. The user request and state information are used to create the LLM prompt. The LLM outputs tasks to the Director system, which creates joint commands at the actuation level to move the robot. One or more safety modules may take over the actuation level to ensure safety compliance.
  • Figure 3: NUgus platform in the Webots simulation environment (left) and on real hardware (right).
  • Figure 4: Experiment setup in the Webots simulator. The robot is placed facing away from the ball to encourage the use of world information to achieve requests.
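
The data flow described in the Figure 2 caption (user request plus state feedback go into a prompt, the LLM returns tasks, and safety can pre-empt them) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the state fields, task names, prompt wording, and the `fake_llm` stand-in are all hypothetical, and the safety check is simplified to a planning-level override rather than a takeover of the actuation level.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RobotState:
    # Hypothetical state fields; the paper's actual state feedback may differ.
    ball_visible: bool
    distance_to_ball_m: float
    fallen: bool


def build_prompt(user_request: str, state: RobotState, allowed_tasks: List[str]) -> str:
    """Combine the user request and the current state into one prompt string."""
    return (
        f"Goal: {user_request}\n"
        f"State: ball_visible={state.ball_visible}, "
        f"distance_to_ball_m={state.distance_to_ball_m:.2f}, "
        f"fallen={state.fallen}\n"
        f"Allowed tasks: {', '.join(allowed_tasks)}\n"
        "Reply with one allowed task per line."
    )


def parse_tasks(reply: str, allowed_tasks: List[str]) -> List[str]:
    """Keep only lines naming an allowed task, so every emitted task is executable."""
    allowed = set(allowed_tasks)
    return [line.strip() for line in reply.splitlines() if line.strip() in allowed]


def plan_step(
    user_request: str,
    state: RobotState,
    allowed_tasks: List[str],
    llm: Callable[[str], str],
) -> List[str]:
    """One planning cycle: safety check first, then prompt the LLM and filter its reply."""
    if state.fallen:
        # Simplified safety override (assumption): recover before following LLM tasks.
        return ["GetUp"]
    reply = llm(build_prompt(user_request, state, allowed_tasks))
    return parse_tasks(reply, allowed_tasks)


# Hypothetical task vocabulary and a canned reply standing in for a real LLM call.
ALLOWED = ["FindBall", "WalkToBall", "KickBall"]


def fake_llm(prompt: str) -> str:
    return "FindBall\nWalkToBall\nDance\nKickBall"


if __name__ == "__main__":
    upright = RobotState(ball_visible=False, distance_to_ball_m=1.5, fallen=False)
    print(plan_step("kick the ball", upright, ALLOWED, fake_llm))
```

Calling `plan_step` once per cycle with fresh state is one simple way to get a rolling plan. Filtering the reply against a fixed task vocabulary is a design choice made here so that unknown tasks such as `Dance` are dropped rather than passed to the behaviour layer.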