Incremental Learning of Humanoid Robot Behavior from Natural Interaction and Large Language Models

Leonard Bärmann, Rainer Kartmann, Fabian Peller-Konrad, Jan Niehues, Alex Waibel, Tamim Asfour

TL;DR

This paper proposes a system for incremental learning of complex high-level behavior from natural interaction, implements it on a humanoid robot, and shows that the incrementally learned knowledge generalizes.

Abstract

Natural-language dialog is key for intuitive human-robot interaction. It can be used not only to express humans' intents, but also to communicate instructions for improvement if a robot does not understand a command correctly. Of great importance is to endow robots with the ability to learn from such interaction experience in an incremental way to allow them to improve their behaviors or avoid mistakes in the future. In this paper, we propose a system to achieve incremental learning of complex behavior from natural interaction, and demonstrate its implementation on a humanoid robot. Building on recent advances, we present a system that deploys Large Language Models (LLMs) for high-level orchestration of the robot's behavior, based on the idea of enabling the LLM to generate Python statements in an interactive console to invoke both robot perception and action. The interaction loop is closed by feeding back human instructions, environment observations, and execution results to the LLM, thus informing the generation of the next statement. Specifically, we introduce incremental prompt learning, which enables the system to interactively learn from its mistakes. For that purpose, the LLM can call another LLM responsible for code-level improvements of the current interaction based on human feedback. The improved interaction is then saved in the robot's memory, and thus retrieved on similar requests. We integrate the system in the robot cognitive architecture of the humanoid robot ARMAR-6 and evaluate our methods both quantitatively (in simulation) and qualitatively (in simulation and real-world) by demonstrating generalized incrementally-learned knowledge.
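
The console-style interaction loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `list_objects`, `grasp`, `run_console`, and the `done()` stop marker are hypothetical names, not the paper's API, and a scripted function stands in for the LLM so the example runs without any model access.

```python
from typing import Callable, List


# Hypothetical robot API exposed to the console (illustrative names only).
def list_objects() -> List[str]:
    """Stand-in for a perception call that reports objects in the scene."""
    return ["sponge", "cup"]


def grasp(name: str) -> str:
    """Stand-in for an action skill; fails on objects that are not in the scene."""
    if name not in list_objects():
        raise ValueError(f"unknown object: {name}")
    return f"grasped {name}"


def run_console(user_request: str,
                next_statement: Callable[[List[str]], str],
                max_steps: int = 10) -> List[str]:
    """Ask the model for one Python expression at a time, evaluate it,
    and append the result (or the error) to the transcript it sees next."""
    namespace = {"list_objects": list_objects, "grasp": grasp}
    transcript = [f"# user: {user_request}"]
    for _ in range(max_steps):
        statement = next_statement(transcript)
        if statement == "done()":  # hypothetical stop marker
            break
        transcript.append(f">>> {statement}")
        try:
            result = eval(statement, namespace)
        except Exception as exc:
            # Execution errors are fed back as text instead of aborting the loop.
            result = f"{type(exc).__name__}: {exc}"
        transcript.append(repr(result))
    return transcript


def scripted_model(transcript: List[str]) -> str:
    """Stand-in for the LLM: replays a fixed script, one statement per call."""
    script = ["list_objects()", "grasp('sponge')", "done()"]
    issued = sum(1 for line in transcript if line.startswith(">>> "))
    return script[issued]


transcript = run_console("hand me the sponge", scripted_model)
print("\n".join(transcript))
```

In a real deployment the scripted function would be replaced by a call to a language model that receives the transcript as its prompt, and evaluating model-generated code would need sandboxing that this sketch omits.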

Paper Structure

This paper contains 31 sections, 5 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: ARMAR-6 incrementally learns behavior from natural interaction. Demonstration video at https://youtu.be/y5O2mRGtsLM
  • Figure 2: Comparison of Code as Policies (Liang et al., 2023), HELPER (Sarch et al., 2023), DROC (Zha et al., 2023), and our method, focusing on the information flow from user input, observations, prompts, and memories through LLM modules to robot execution, and on how the methods learn from user interactions. Building on the interactive Python console prompting scheme, our method realizes incremental learning from natural interaction in a conceptually simple way.
  • Figure 3: Incremental learning of robot behavior from interaction
  • Figure 4: Conceptual view of our system. The robot's memory system (Peller-Konrad et al., 2023) works as a mediator between the interaction manager and the robot system. The interaction LLM acts in a Python console environment. It can invoke functions to fetch the content of the current scene (as given by perception modules and stored in the memory) or invoke skills and thus perform robot actions. Relevant interaction examples are queried from the memory for few-shot prompting of the LLM. Incremental learning is performed by an improvement LLM updating the interaction examples memory with new content learned from instruction.
  • Figure 5: Overview of our method for incremental learning of robot behavior. We use an LLM (in our experiments, GPT-4 (OpenAI, 2023)) to control robot perception and action given a prompt of few-shot examples (bottom, \ref{sec:methods:python_console}). Prompts are constructed dynamically based on the similarity to the current user request (top left, \ref{sec:methods:prompting}). The interaction examples memory is initialized with prior knowledge, and then incrementally enriched by LLM-improved problematic interactions to learn from mistakes (top right, \ref{sec:methods:learning}).
  • ...and 4 more figures
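
The Figure 5 caption describes prompts built from the stored interaction examples most similar to the current request, with the memory growing as improved interactions are saved. A minimal sketch of that idea follows. It is an illustration under stated assumptions, not the paper's implementation: `InteractionExample`, `ExampleMemory`, and `build_prompt` are hypothetical names, and word-overlap (Jaccard) similarity is a simple stand-in for whatever similarity measure the system actually uses.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InteractionExample:
    request: str      # the user request that started the interaction
    console_log: str  # the console transcript stored for few-shot prompting


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a stand-in for the real similarity measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


@dataclass
class ExampleMemory:
    examples: List[InteractionExample] = field(default_factory=list)

    def add(self, example: InteractionExample) -> None:
        """Store an example, e.g. an interaction improved after user feedback."""
        self.examples.append(example)

    def top_k(self, request: str, k: int = 2) -> List[InteractionExample]:
        """Return the k stored examples most similar to the request."""
        ranked = sorted(self.examples,
                        key=lambda e: jaccard(request, e.request),
                        reverse=True)
        return ranked[:k]


def build_prompt(memory: ExampleMemory, request: str, k: int = 2) -> str:
    """Concatenate the retrieved examples, then append the new request."""
    parts = [f"# user: {e.request}\n{e.console_log}"
             for e in memory.top_k(request, k)]
    parts.append(f"# user: {request}")
    return "\n\n".join(parts)


# Prior knowledge (the skill names in the logs are hypothetical).
memory = ExampleMemory()
memory.add(InteractionExample("bring me the cup", ">>> bring('cup')"))
memory.add(InteractionExample("wipe the table", ">>> wipe('table')"))

# An interaction improved after feedback is saved for future retrieval.
memory.add(InteractionExample("wipe the table with the sponge",
                              ">>> grasp('sponge')\n>>> wipe('table')"))

prompt = build_prompt(memory, "please wipe the table", k=1)
print(prompt)
```

Because retrieval depends only on the stored requests, saving an improved interaction changes future prompts for similar requests without retraining the model.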