Goal Space Abstraction in Hierarchical Reinforcement Learning via Set-Based Reachability Analysis

Mehdi Zadem, Sergio Mover, Sao Mai Nguyen

TL;DR

Problem: learning symbolic goal representations for hierarchical RL in continuous state spaces with sparse rewards. Approach: GARA, a two-level Feudal HRL algorithm that concurrently learns a hierarchical policy and a discrete goal space $\mathcal{G}$, using set-based reachability analysis on a k-forward model $\mathcal{F}_k$ to approximate the reachability relation $R_k$. Contributions: automatic discovery of an interpretable, transferable subgoal abstraction whose goals are subsets of states, refinement of that abstraction via reachability analysis, and evidence of data efficiency and transfer across maze tasks. Significance: enables planning over a symbolic graph of subgoals without manual goal design and improves sample efficiency in open-ended, high-dimensional settings.

Abstract

Open-ended learning benefits immensely from the use of symbolic methods for goal representation, as they offer ways to structure knowledge for efficient and transferable learning. However, existing Hierarchical Reinforcement Learning (HRL) approaches that rely on symbolic reasoning are often limited because they require a manual goal representation. The challenge in autonomously discovering a symbolic goal representation is that it must preserve critical information, such as the environment dynamics. In this paper, we propose a developmental mechanism for goal discovery via an emergent representation that abstracts (i.e., groups together) sets of environment states that have similar roles in the task. We introduce a Feudal HRL algorithm that concurrently learns both the goal representation and a hierarchical policy. The algorithm uses symbolic reachability analysis for neural networks to approximate the transition relation among sets of states and to refine the goal representation. We evaluate our approach on complex navigation tasks, showing that the learned representation is interpretable and transferable, and that it results in data-efficient learning.
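To make the idea of set-based reachability through a neural network concrete, the following is a minimal sketch and not the paper's algorithm. It assumes goals are axis-aligned boxes, stands in a tiny hand-written ReLU network for the learned forward model, and uses plain interval arithmetic to over-approximate the image of a box. All names (`forward_bounds`, `classify`, `split_box`) and the bisection rule are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]
Box = List[Interval]  # one (low, high) interval per state dimension
Layer = Tuple[List[List[float]], List[float]]  # (weight rows, biases)


def affine_bounds(box: Box, weights: List[List[float]], bias: List[float]) -> Box:
    """Propagate a box through y = W x + b with interval arithmetic."""
    out = []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, (l, h) in zip(row, box):
            if w >= 0:
                lo += w * l
                hi += w * h
            else:  # a negative weight swaps which endpoint gives the bound
                lo += w * h
                hi += w * l
        out.append((lo, hi))
    return out


def relu_bounds(box: Box) -> Box:
    """ReLU is monotone, so it can be applied to each endpoint."""
    return [(max(0.0, l), max(0.0, h)) for l, h in box]


def forward_bounds(box: Box, layers: List[Layer]) -> Box:
    """Over-approximate the image of `box` under a ReLU MLP (linear last layer)."""
    for i, (weights, bias) in enumerate(layers):
        box = affine_bounds(box, weights, bias)
        if i < len(layers) - 1:
            box = relu_bounds(box)
    return box


def classify(image: Box, target: Box) -> str:
    """Compare the over-approximated image of a goal set with a target set."""
    inside = all(t_lo <= lo and hi <= t_hi
                 for (lo, hi), (t_lo, t_hi) in zip(image, target))
    disjoint = any(hi < t_lo or lo > t_hi
                   for (lo, hi), (t_lo, t_hi) in zip(image, target))
    if inside:
        return "reaches"          # every predicted successor lies in the target
    if disjoint:
        return "does_not_reach"   # no predicted successor lies in the target
    return "split"                # inconclusive: refine the goal set


def split_box(box: Box) -> Tuple[Box, Box]:
    """Bisect a box along its widest dimension (an assumed refinement rule)."""
    widths = [h - l for l, h in box]
    d = widths.index(max(widths))
    l, h = box[d]
    mid = (l + h) / 2.0
    left, right = list(box), list(box)
    left[d], right[d] = (l, mid), (mid, h)
    return left, right


# Toy stand-in for a forward model: x' = x + 2 in one dimension.
shift_model: List[Layer] = [([[1.0]], [2.0])]
G: Box = [(0.0, 2.0)]
G_d: Box = [(3.5, 6.0)]
verdict = classify(forward_bounds(G, shift_model), G_d)
```

In this toy setting the verdict for `G` is inconclusive, so `split_box(G)` would be called and each half re-examined. Because interval arithmetic over-approximates, a "split" verdict can also arise purely from loose bounds rather than from genuinely mixed behaviour inside the set.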

Paper Structure (17 sections, 1 equation, 5 figures, 3 algorithms)


Figures (5)

  • Figure 1: SplitSet splits set $G$ (subfigure 1), concludes that $G$ reaches $G_d$ (subfigure 2), or concludes that $G$ does not reach $G_d$ (subfigure 4).
  • Figure 2: Representation of the goal space $\mathcal{G}$ in the U-shaped maze for one run of the algorithm. The exit is marked in red. Green boxes show the intervals for $x,y$, and the horizontal and vertical arrows indicate the sign of the velocities $v_x$ and $v_y$, respectively. The absence of arrows indicates that there is no split across $v_x$ or $v_y$.
  • Figure 3: Average success rate on the U-shaped Maze (over 20 runs).
  • Figure 4: Goal space representations for the 4-Rooms Maze.
  • Figure 5: Average success rate on the 4-Rooms Maze (over 20 runs). -T refers to the transferred versions of the algorithms.