Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use

Jiajun Xi, Yinong He, Jianing Yang, Yinpei Dai, Joyce Chai

TL;DR

This paper examines how different levels of language informativeness and diversity impact agent learning and inference and demonstrates that agents trained with diverse and informative language feedback can achieve enhanced generalization and fast adaptation to new tasks.

Abstract

In real-world scenarios, it is desirable for embodied agents to have the ability to leverage human language to gain explicit or implicit knowledge for learning tasks. Despite recent progress, most previous approaches adopt simple low-level instructions as language inputs, which may not reflect natural human communication. It's not clear how to incorporate rich language use to facilitate task learning. To address this question, this paper studies different types of language inputs in facilitating reinforcement learning (RL) embodied agents. More specifically, we examine how different levels of language informativeness (i.e., feedback on past behaviors and future guidance) and diversity (i.e., variation of language expressions) impact agent learning and inference. Our empirical results based on four RL benchmarks demonstrate that agents trained with diverse and informative language feedback can achieve enhanced generalization and fast adaptation to new tasks. These findings highlight the pivotal role of language use in teaching embodied agents new tasks in an open world. Project website: https://github.com/sled-group/Teachable_RL

Paper Structure

This paper contains 42 sections, 10 figures, 8 tables, 1 algorithm.

Figures (10)

  • Figure 1: An overview of four environments used for experiments. It shows tasks to be learned in each environment; examples of hindsight (marked H) and foresight (F) language feedback (next to the gear icon are hand-crafted templates and next to the GPT icon are GPT-4 generated feedback); as well as low-level actions in each environment.
  • Figure 2: A demonstration of hindsight and foresight language feedback generation. In our framework, the agent $\pi$ executes the trajectory, while the expert agent $\pi^{*}$, with access to privileged ground truth knowledge, is used solely to provide information for generating language feedback to $\pi$. At time step $t$, hindsight language is generated by comparing the agent's action $a_{t-1}$ with the expert agent's action $a_{t-1}^*$, whereas foresight language is generated by referring to the expert agent's action $a_{t}^*$ to guide the agent on the next step. To increase the diversity of language feedback, we construct a pool of language templates comprising GPT-augmented languages, and sample candidate instructions as online language feedback.
  • Figure 3: Language-Teachable Decision Transformer.
  • Figure 4: Comparison of agent performance in four environments (averaged across 100 seeds in each environment) under varying levels of language feedback informativeness and diversity. Agents trained with more informative language feedback exhibit progressively higher performance. Furthermore, given the same informativeness (Hindsight + Foresight), increasing diversity with the GPT-augmented language pool leads to the highest performance.
  • Figure 5: Comparison of agent performance on unseen tasks in four environments (averaged across 100 seeds in each environment) under varying language informativeness in agent pre-training. Agents trained with more informative language adapt to new tasks faster and better.
  • ...and 5 more figures
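
The feedback generation described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes actions are plain strings and that the template pools are small hand-written lists; the function names, template wording, and pool contents below are hypothetical and are not taken from the paper or its repository. Hindsight language compares the agent's previous action $a_{t-1}$ with the expert's $a_{t-1}^*$, foresight language verbalizes the expert's next action $a_{t}^*$, and sampling from a pool stands in for the GPT-augmented diversity.

```python
import random

# Hypothetical template pools (illustrative wording, not from the paper).
HINDSIGHT_MATCH = [
    "Good job, {action} was the right move.",
    "Nice, {action} is what the expert would have done.",
]
HINDSIGHT_MISMATCH = [
    "You chose to {action}, but it was better to {expert}.",
    "Instead of {action}, the expert would {expert}.",
]
FORESIGHT = [
    "Next, you should {expert}.",
    "Now try to {expert}.",
]


def hindsight_feedback(prev_action, prev_expert_action, rng):
    """Comment on the agent's last action a_{t-1} by comparing it with a*_{t-1}."""
    if prev_action == prev_expert_action:
        return rng.choice(HINDSIGHT_MATCH).format(action=prev_action)
    return rng.choice(HINDSIGHT_MISMATCH).format(
        action=prev_action, expert=prev_expert_action
    )


def foresight_feedback(expert_action, rng):
    """Guide the next step by verbalizing the expert's action a*_t."""
    return rng.choice(FORESIGHT).format(expert=expert_action)


def language_feedback(prev_action, prev_expert_action, expert_action, rng=None):
    """Combine hindsight and foresight into one feedback string for time step t."""
    rng = rng or random.Random()
    return " ".join(
        [
            hindsight_feedback(prev_action, prev_expert_action, rng),
            foresight_feedback(expert_action, rng),
        ]
    )


if __name__ == "__main__":
    print(language_feedback("turn left", "go forward", "pick up the key"))
```

Sampling a template at every step, rather than fixing one per feedback type, is the simplest way to mimic the varied phrasing the caption attributes to the language pool; a larger pool would only change the lists above.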