EXPLORER: Exploration-guided Reasoning for Textual Reinforcement Learning

Kinjal Basu, Keerthiram Murugesan, Subhajit Chaudhury, Murray Campbell, Kartik Talamadupula, Tim Klinger

TL;DR

EXPLORER is a neuro-symbolic, exploration-guided reasoning agent for textual reinforcement learning: a neural module handles exploration and a symbolic module handles exploitation. It can also learn generalized symbolic policies and performs well on unseen data.

Abstract

Text-based games (TBGs) have emerged as an important collection of NLP tasks, requiring reinforcement learning (RL) agents to combine natural language understanding with reasoning. A key challenge for agents attempting to solve such tasks is to generalize across multiple games and perform well on both seen and unseen objects. Purely deep-RL-based approaches may perform well on seen objects; however, they fail to match that performance on unseen objects. Commonsense-infused deep-RL agents may work better on unseen data; unfortunately, their policies are often not interpretable or easily transferable. To tackle these issues, we present EXPLORER, an exploration-guided reasoning agent for textual reinforcement learning. EXPLORER is neuro-symbolic in nature, as it relies on a neural module for exploration and a symbolic module for exploitation. It can also learn generalized symbolic policies and perform well on unseen data. Our experiments show that EXPLORER outperforms the baseline agents on Text-World Cooking (TW-Cooking) and Text-World Commonsense (TWC) games.

Paper Structure

This paper contains 13 sections, 2 equations, 8 figures, 2 tables, and 1 algorithm.

Figures (8)

  • Figure 1: An overview of the EXPLORER agent's dataflow on a TWC game. In EXPLORER, the neural module is responsible for exploration and collects <action, state, reward> tuples, whereas the symbolic module learns the rules and performs exploitation using commonsense knowledge from WordNet.
  • Figure 2: Overview of EXPLORER's decision-making at any given time step. The hybrid neuro-symbolic architecture consists of five main modules: (a) the Context Encoder encodes the observation into a dynamic context, (b) the Action Encoder encodes the admissible actions, (c) the Neural Action Selector combines (a) and (b) with the $\bigoplus$ operator, (d) the Symbolic Action Selector returns a set of candidate actions, and (e) the Symbolic Rule Learner uses ILP and WordNet-based rule generalization to generate symbolic rules.
  • Figure 3: Entity extraction using Action Template
  • Figure 4: ILP Rule Learning Example
  • Figure 5: Example of Rule Generalization
  • ...and 3 more figures
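
The division of labor described in the Figure 1 and Figure 2 captions can be illustrated with a small sketch. This is not the authors' implementation: the hypernym table stands in for WordNet, the single rule stands in for an ILP-learned rule, and the names `HYPERNYMS`, `RULES`, `symbolic_candidates`, and `select_action` are all hypothetical. It also assumes that the agent prefers a symbolic candidate when one exists and otherwise falls back to the neural selector, which the captions suggest but do not spell out.

```python
from typing import Callable, Dict, List

# Toy stand-in for a WordNet hypernym lookup (hypothetical data).
HYPERNYMS: Dict[str, str] = {"apple": "fruit", "banana": "fruit", "shirt": "clothing"}

# Generalized rules: hypernym class -> container it belongs in (hypothetical).
RULES: Dict[str, str] = {"fruit": "fridge"}


def symbolic_candidates(admissible: List[str]) -> List[str]:
    """Return the admissible actions that some generalized rule supports."""
    supported = []
    for action in admissible:
        words = action.split()
        # Only the template "insert <object> into <container>" is handled here.
        if len(words) == 4 and words[0] == "insert" and words[2] == "into":
            obj, container = words[1], words[3]
            # Lift the object to its hypernym so the rule covers unseen objects.
            if RULES.get(HYPERNYMS.get(obj, obj)) == container:
                supported.append(action)
    return supported


def select_action(
    admissible: List[str],
    neural_policy: Callable[[List[str]], str],
) -> str:
    """Exploit a symbolic candidate if one exists, else explore neurally."""
    candidates = symbolic_candidates(admissible)
    if candidates:
        return candidates[0]
    return neural_policy(admissible)
```

Because the rule is stated over the class `fruit` rather than over `apple`, it also fires for `banana`, which is the kind of generalization to unseen objects that the abstract refers to. When no rule applies, the choice is left to whatever neural policy is passed in.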