Reinforcement Learning in Hyperbolic Spaces: Models and Experiments

Vladimir Jaćimović, Zinaid Kapić, Aladin Crnkić

TL;DR

This paper examines five setups in which one or two agents explore an unknown environment without any prior information. It introduces the statistical and dynamical models needed to address problems of this kind and implements algorithms based on this framework.

Abstract

We examine five setups where an agent (or two agents) seeks to explore an unknown environment without any prior information. Although seemingly very different, all of them can be formalized as Reinforcement Learning (RL) problems in hyperbolic spaces. More precisely, it is natural to endow the action spaces with the hyperbolic metric. We introduce the statistical and dynamical models necessary for addressing problems of this kind and implement algorithms based on this framework. Throughout the paper we view RL through the lens of black-box optimization.
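To make the phrase "hyperbolic metric" concrete, the following minimal sketch computes hyperbolic distance in the Poincaré unit disk and checks that a disk-preserving Möbius transformation leaves it unchanged. The choice of the Poincaré disk model and the function names `poincare_distance` and `mobius` are assumptions made for illustration; the abstract does not say which model of hyperbolic space the paper uses.

```python
import cmath
import math


def poincare_distance(z: complex, w: complex) -> float:
    """Hyperbolic distance between two points of the open unit disk.

    Uses the standard formula d(z, w) = arcosh(1 + 2|z-w|^2 / ((1-|z|^2)(1-|w|^2))).
    """
    num = 2 * abs(z - w) ** 2
    den = (1 - abs(z) ** 2) * (1 - abs(w) ** 2)
    return math.acosh(1 + num / den)


def mobius(z: complex, a: complex, theta: float = 0.0) -> complex:
    """Disk-preserving Möbius map z -> e^{i*theta} (z - a) / (1 - conj(a) z), with |a| < 1."""
    return cmath.exp(1j * theta) * (z - a) / (1 - a.conjugate() * z)


if __name__ == "__main__":
    z, w, a = 0.3 + 0.2j, -0.1 + 0.5j, 0.4 - 0.3j
    before = poincare_distance(z, w)
    after = poincare_distance(mobius(z, a, 0.7), mobius(w, a, 0.7))
    # The two values agree up to floating-point error: Möbius maps of this
    # form are isometries of the Poincaré disk.
    print(before, after)
```

Distances grow without bound as a point approaches the unit circle, which is one reason hyperbolic geometry suits tree-like or hierarchical action spaces.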

Paper Structure

This paper contains 23 sections, 3 theorems, 18 equations, 15 figures.

Key Result

Proposition 1

Let $z_1(t),\dots,z_N(t)$ be trajectories satisfying the system (global_swarm). There exists a one-parameter family $g_t$ of transformations of the form (Mobius), such that

Figures (15)

  • Figure 1: Illustration of the frog problem.
  • Figure 2: Multi-layer graph with three layers.
  • Figure 3: Reward setting in the directional labyrinth.
  • Figure 4: Reward setting of the problem (two_agents_plane).
  • Figure 5: Position of points at the moments $T=0,1,2$ with corresponding distributions: a) ${\cal N}(2.31, 1.85)$, ${\cal N}(-0.07, 0.73)$, ${\cal N}(-0.06, 0.37)$, ${\cal N}(0.05, 0.79)$, ${\cal N}(0.12, 0.44)$; b) ${\cal N}(1.56, 0.03)$, ${\cal N}(2.42, 0.39)$, ${\cal N}(0.11, 0.20)$, ${\cal N}(2.41, 0.61)$, ${\cal N}(0.03, 0.23)$, and c) ${\cal N}(1.17, 0.00)$, ${\cal N}(1.80, 0.01)$, ${\cal N}(0.07, 0.04)$, ${\cal N}(1.96, 0.01)$, ${\cal N}(-0.08, 0.07)$.
  • ...and 10 more figures

Theorems & Definitions (5)

  • Proposition 1
  • Proposition 2
  • Remark 1
  • Remark 2
  • Lemma 3