Graph Constrained Reinforcement Learning for Natural Language Action Spaces
Prithviraj Ammanabrolu, Matthew Hausknecht
TL;DR
The paper tackles reinforcement learning in text-based Interactive Fiction (IF) games, where the natural-language action space is combinatorially large. It introduces KG-A2C, which represents state as a dynamic knowledge graph and generates actions from a template-based action space constrained by a graph mask, making exploration scalable. In experiments on the Jericho framework, KG-A2C performs well across a diverse set of games and outperforms a strong template-based baseline in many of them. Ablation analyses indicate that the knowledge graph, graph attention encoding, and valid-action supervision each contribute substantially to performance.
Abstract
Interactive Fiction (IF) games are text-based simulations in which an agent interacts with the world purely through natural language. They are ideal environments for studying how to extend reinforcement learning agents to meet the challenges of natural language understanding, partial observability, and action generation in combinatorially large text-based action spaces. We present KG-A2C, an agent that builds a dynamic knowledge graph while exploring and generates actions using a template-based action space. We contend that the dual uses of the knowledge graph, to reason about game state and to constrain natural language generation, are the keys to scalable exploration of combinatorially large natural language action spaces. Results across a wide variety of IF games show that KG-A2C outperforms current IF agents despite the exponential increase in action space size.
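To make the idea of a graph-constrained template action space concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy set of templates with `OBJ` slots, a tiny vocabulary, and a knowledge graph stored as (subject, relation, object) triples; all of these names and values are hypothetical. It also assumes the simplest possible mask, one that allows a vocabulary word to fill a slot only if the word appears as an entity in the current graph.

```python
from itertools import product

# Hypothetical templates; "OBJ" marks a slot to be filled with a vocabulary word.
TEMPLATES = ["look", "take OBJ", "open OBJ", "put OBJ in OBJ"]
# Hypothetical vocabulary of candidate object words.
VOCAB = ["lamp", "mailbox", "sword", "door", "leaflet"]


def graph_mask(vocab, graph_entities):
    """Return a 0/1 mask over vocab: 1 if the word is an entity in the graph."""
    return [1 if word in graph_entities else 0 for word in vocab]


def candidate_actions(templates, vocab, mask):
    """Enumerate every action obtained by filling template slots with unmasked words."""
    allowed = [word for word, keep in zip(vocab, mask) if keep]
    actions = []
    for template in templates:
        tokens = template.split()
        n_slots = tokens.count("OBJ")
        # product(..., repeat=0) yields one empty tuple, so slot-free
        # templates such as "look" produce exactly one action.
        for objs in product(allowed, repeat=n_slots):
            fill = iter(objs)
            actions.append(
                " ".join(next(fill) if tok == "OBJ" else tok for tok in tokens)
            )
    return actions


# Toy knowledge graph as (subject, relation, object) triples (assumed format).
graph = {
    ("you", "in", "west of house"),
    ("mailbox", "in", "west of house"),
    ("leaflet", "in", "mailbox"),
}
entities = {s for s, _, _ in graph} | {o for _, _, o in graph}

mask = graph_mask(VOCAB, entities)
unmasked = candidate_actions(TEMPLATES, VOCAB, [1] * len(VOCAB))
masked = candidate_actions(TEMPLATES, VOCAB, mask)

print(len(unmasked), len(masked))
```

In this toy setting the mask shrinks the candidate set from every template and word combination to only those that mention entities the agent has already observed. A learned agent would score templates and objects with a policy network instead of enumerating them, but the mask would restrict the object choices in the same way.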
