Hereditary Geometric Meta-RL: Nonlocal Generalization via Task Symmetries

Paul Nitschke, Shahriar Talebi

TL;DR

This work shows that when the task space is inherited from the symmetries of the underlying system, it embeds into a subgroup of those symmetries whose actions are linearizable, connected, and compact. These properties enable efficient learning and inference at test time.

Abstract

Meta-Reinforcement Learning (Meta-RL) commonly generalizes via smoothness in the task encoding. While this enables local generalization around each training task, it requires dense coverage of the task space and leaves richer task-space structure untapped. In response, we develop a geometric perspective that endows the task space with a "hereditary geometry" induced by the inherent symmetries of the underlying system. Concretely, the agent reuses a policy learned at training time by transforming states and actions through actions of a Lie group. This converts Meta-RL into symmetry discovery rather than smooth extrapolation, enabling the agent to generalize to wider regions of the task space. We show that when the task space is inherited from the symmetries of the underlying system, it embeds into a subgroup of those symmetries whose actions are linearizable, connected, and compact. These properties enable efficient learning and inference at test time. To learn these structures, we develop a differential symmetry discovery method. This method collapses functional invariance constraints and thereby improves numerical stability and sample efficiency over functional approaches. Empirically, on a two-dimensional navigation task, our method efficiently recovers the ground-truth symmetry and generalizes across the entire task space, while a common baseline generalizes only near training tasks.
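
The policy-reuse idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the symmetry group is planar rotation about the origin, that the dynamics are $s' = s + a$, and that the trained policy is a hand-written stand-in; none of these choices, nor the names `base_policy`, `transformed_policy`, and `rollout`, come from the paper. The new policy is obtained as $a = K_g\,\pi(L_g^{-1} s)$, with both actions given by the same rotation matrix.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix; plays the role of both L_g (states) and K_g (actions)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def base_policy(state, goal=np.array([1.0, 0.0]), step=0.1):
    """Hypothetical stand-in for a trained policy: bounded step toward the training goal."""
    delta = goal - state
    dist = np.linalg.norm(delta)
    if dist <= step:
        return delta
    return step * delta / dist

def transformed_policy(state, theta):
    """Reuse base_policy for the rotated goal: a = K_g pi(L_g^{-1} s)."""
    R = rotation(theta)
    return R @ base_policy(R.T @ state)  # R.T is the inverse of a rotation

def rollout(policy, steps=50):
    """Assumed dynamics s' = s + a, starting from the origin."""
    s = np.zeros(2)
    for _ in range(steps):
        s = s + policy(s)
    return s

theta = 2.0  # an unseen task: the training goal rotated by 2 radians
final = rollout(lambda s: transformed_policy(s, theta))
target = rotation(theta) @ np.array([1.0, 0.0])
```

Under these assumptions no retraining is involved: the rollout ends at the rotated goal because the rotation commutes with the assumed dynamics.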

Paper Structure

This paper contains 11 sections, 5 theorems, 41 equations, 3 figures.

Key Result

lemma 1

Assume there exists a Lie group $G$ with linear left actions $L_g: S \rightarrow S$ and $K_g: A \rightarrow A$ such that for all $s,s' \in S, a \in A$. Then, the geometry in $\mathbb{M}$ is hereditary.

Figures (3)

  • Figure 1: Illustration of the $2$-D navigation task. After learning to navigate from the origin $s_0$ to the goal positions $\taskembedding_0$ and $\taskembedding_1$, the agent aims to generalize its knowledge to navigate to the unseen location $\taskembedding_2$ at test time.
  • Figure 2: Differential (green) and functional (blue) symmetry discovery agents compared against an oracle (red), evaluated on either the differential loss (left) or the functional loss (right) over time (lower is better). Differential symmetry discovery (green) is an order of magnitude more sample efficient and more stable.
  • Figure 3: Generalization in the $2$-D navigation task: regret versus distance to the closest training task for our geometric approach (green) and CCM (blue) (lower is better). The CCM agent generalizes well to nearby tasks but collapses for tasks distant from the training set, while the geometric agent generalizes across the entire task space.
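
The distinction between the two losses compared in Figure 2 can be sketched on a toy invariant function. This is a generic illustration of functional versus infinitesimal invariance, not the paper's loss: the function `f`, the candidate generators, and the helper names are all assumptions. The functional residual compares $f(e^{tW}x)$ with $f(x)$ over sampled group parameters $t$, while the differential residual evaluates $\langle \nabla f(x), Wx \rangle$ once per sample and needs no matrix exponential.

```python
import numpy as np

def f(x):
    """Toy function assumed invariant under rotations: squared norm."""
    return np.sum(x ** 2, axis=-1)

def grad_f(x):
    """Analytic gradient of f."""
    return 2.0 * x

def expm_taylor(M, terms=30):
    """Truncated Taylor series for the matrix exponential (adequate for small M)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def differential_loss(W, xs):
    """Mean squared infinitesimal residual <grad f(x), W x>."""
    return float(np.mean(np.sum(grad_f(xs) * (xs @ W.T), axis=-1) ** 2))

def functional_loss(W, xs, ts):
    """Mean squared residual f(exp(tW) x) - f(x), averaged over sampled t."""
    losses = []
    for t in ts:
        g = expm_taylor(t * W)
        losses.append(np.mean((f(xs @ g.T) - f(xs)) ** 2))
    return float(np.mean(losses))

rng = np.random.default_rng(0)
xs = rng.normal(size=(256, 2))
ts = np.linspace(-1.0, 1.0, 5)

W_rot = np.array([[0.0, -1.0], [1.0, 0.0]])  # generator of rotations: a true symmetry of f
W_scale = np.eye(2)                          # generator of scalings: not a symmetry of f
```

In this toy setting both losses vanish for `W_rot` and are strictly positive for `W_scale`; the differential form reaches that verdict without sampling `t` or exponentiating the generator.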

Theorems & Definitions (15)

  • definition 1: Linear left actions
  • definition 2: Hereditary Geometry
  • lemma 1
  • proof
  • proof
  • definition 3: Symmetric MDP
  • lemma 2
  • definition 4: Compatible Symmetry
  • theorem 1: Hereditary Geometry from Symmetry
  • proof
  • ...and 5 more