Graph Constrained Reinforcement Learning for Natural Language Action Spaces

Prithviraj Ammanabrolu, Matthew Hausknecht

TL;DR

The paper tackles reinforcement learning in text-based Interactive Fiction with extremely large natural-language action spaces. It introduces KG-A2C, which uses a dynamic knowledge graph to represent state and a template-based action space constrained by a graph mask to enable scalable exploration. Through extensive Jericho-based experiments, KG-A2C achieves strong performance and generalizes across diverse games, outperforming a strong template-based baseline in many cases. Ablation analyses confirm the critical roles of the knowledge graph, graph attention encoding, and valid-action supervision in language-grounded RL.

Abstract

Interactive Fiction games are text-based simulations in which an agent interacts with the world purely through natural language. They are ideal environments for studying how to extend reinforcement learning agents to meet the challenges of natural language understanding, partial observability, and action generation in combinatorially large text-based action spaces. We present KG-A2C, an agent that builds a dynamic knowledge graph while exploring and generates actions using a template-based action space. We contend that the dual uses of the knowledge graph to reason about game state and to constrain natural language generation are the keys to scalable exploration of combinatorially large natural language actions. Results across a wide variety of IF games show that KG-A2C outperforms current IF agents despite the exponential increase in action space size.
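
To make the idea of a graph-constrained template action space concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that templates mark object slots with the token OBJ and that the graph mask keeps only vocabulary words that appear as entities in the knowledge graph. The templates, vocabulary, and triples are hypothetical toy values, and the sketch enumerates actions exhaustively, whereas the actual agent decodes them with a learned policy.

```python
from itertools import product

# Hypothetical templates; "OBJ" marks a slot to be filled with a word.
TEMPLATES = ["go north", "open OBJ", "take OBJ", "put OBJ in OBJ"]

# Hypothetical game vocabulary (assumption, not taken from any real game file).
VOCAB = ["mailbox", "leaflet", "door", "sword", "lamp", "the", "small", "north"]

# Toy knowledge graph stored as (subject, relation, object) triples.
KG = {
    ("you", "in", "west of house"),
    ("mailbox", "in", "west of house"),
    ("leaflet", "in", "mailbox"),
}


def graph_mask(kg, vocab):
    """Keep only the vocabulary words that occur as entities in the graph."""
    entities = {s for s, _, _ in kg} | {o for _, _, o in kg}
    return [word for word in vocab if word in entities]


def fill_templates(templates, objects):
    """Enumerate every action obtained by filling each OBJ slot with an object."""
    actions = []
    for template in templates:
        parts = template.split("OBJ")
        n_slots = len(parts) - 1
        for combo in product(objects, repeat=n_slots):
            # Interleave the fixed template pieces with the chosen objects.
            action = parts[0] + "".join(o + rest for o, rest in zip(combo, parts[1:]))
            actions.append(action)
    return actions


unmasked_actions = fill_templates(TEMPLATES, VOCAB)
masked_objects = graph_mask(KG, VOCAB)
masked_actions = fill_templates(TEMPLATES, masked_objects)

print(len(unmasked_actions), "actions without the mask")
print(len(masked_actions), "actions with the mask")
print(masked_actions)
```

Even in this toy setting, a two-slot template grows quadratically with the number of candidate words, so restricting the candidates to graph entities shrinks the set of actions the agent has to consider.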

Paper Structure

This paper contains 15 sections, 9 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: The full KG-A2C architecture. Solid lines represent computation flow along which the gradient can be back-propagated.
  • Figure 2: An overall example of the knowledge graph building and subsequent action decoding process for a given state in Zork1, illustrating the use of interactive objects and the graph mask.
  • Figure 3: Ablation results on Zork1, averaged across 5 independent runs.
  • Figure 4: Learning curves for KGA2C-full. Shaded regions indicate standard deviations.
  • Figure 5: A map of the world of Zork1 with some initial rewards annotated. The blue arrow indicates a connection between the left and right maps, corresponding to the overworld and the dungeon.