Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

Egor Cherepanov, Nikita Kachaev, Artem Zholus, Alexey K. Kovalev, Aleksandr I. Panov

TL;DR

The paper tackles inconsistencies in how memory is defined and evaluated in reinforcement learning under partial observability. It introduces a neuroscience-inspired taxonomy that separates Memory Decision Making (Memory DM) from Meta-Reinforcement Learning (Meta-RL) and distinguishes short-term from long-term memory by comparing the agent's context length ($K$) with the correlation horizon ($\xi$), together with an effective context ($K_{eff}$) governed by memory mechanisms $\mu(K)$. A robust experimental methodology is proposed to test LTM and STM in Memory DM using memory-intensive environments, with formal notions such as the context border $\overline{K}$ to separate memory types; the approach is validated on Passive-T-Maze and Minigrid-Memory with memory-enhanced baselines. The results demonstrate that misconfigured experiments can lead to misleading conclusions about an agent's memory capabilities, whereas following the framework yields clearer, fairer comparisons and practical guidance for designing memory-aware RL agents.

Abstract

The incorporation of memory into agents is essential for numerous tasks within the domain of Reinforcement Learning (RL). In particular, memory is paramount for tasks that require the utilization of past information, adaptation to novel environments, and improved sample efficiency. However, the term "memory" encompasses a wide range of concepts, which, coupled with the lack of a unified methodology for validating an agent's memory, leads to erroneous judgments about agents' memory capabilities and prevents objective comparison with other memory-enhanced agents. This paper aims to streamline the concept of memory in RL by providing practical, precise definitions of agent memory types, such as long-term versus short-term memory and declarative versus procedural memory, inspired by cognitive science. Using these definitions, we categorize different classes of agent memory, propose a robust experimental methodology for evaluating the memory capabilities of RL agents, and standardize evaluations. Furthermore, by conducting experiments with different RL agents, we empirically demonstrate the importance of adhering to the proposed methodology when evaluating different types of agent memory, and show what its violation leads to.

Paper Structure

This paper contains 28 sections, 1 theorem, 4 equations, 9 figures, 5 tables, 1 algorithm.

Key Result

Theorem 1

Let $\tilde{\mathcal{M}}_P$ be a memory-intensive environment and $K$ be an agent's context length. Then there exists a context memory border $\overline{K} \geq 1$ such that if $K \leq \overline{K}$ then the environment $\tilde{\mathcal{M}}_P$ is used to validate exclusively long-term memory in Memory D

Figures (9)

  • Figure 1: Declarative and procedural memory scheme. Red arrows show the information transfer for memorization, blue arrows show the direction of recall to the required information.
  • Figure 2: Long-term memory and short-term memory scheme. $t_e$ -- event used for decision-making start time, $\Delta t$ -- event duration, $t_r$ -- agent's recall time, $K$ -- agent's context length, $\xi$ -- correlation horizon. If an event is outside the context, long-term memory is needed for decision-making; if within the context, short-term memory suffices.
  • Figure 3: Classification of memory types of RL agents. While the Memory DM framework contrasts with Meta-RL, its formalism can also describe inner-loop tasks when they are POMDPs.
  • Figure 4: Success Rates for SAC-GPT-2 agent with LTM and STM for the Minigrid-Memory environment with map size $L=21$.
  • Figure 6: Memory-intensive environments for testing STM and LTM in Memory DM.
  • ...and 4 more figures
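
The rule in the Figure 2 caption (an event outside the context needs long-term memory, an event within it needs only short-term memory) can be illustrated with a minimal Python sketch. This is not the authors' code. The `Event` class and both function names are hypothetical, and the formula $\xi = t_r - t_e - \Delta t + 1$ for the correlation horizon is an assumption, since this summary names the symbols but does not state how $\xi$ is computed. The comparison $\xi \leq K$ for "within the context" is likewise an assumed reading of the caption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical container for the quantities named in Figure 2."""
    t_e: int      # step at which the event used for decision-making starts
    delta_t: int  # event duration, in steps
    t_r: int      # step at which the agent must recall the event


def correlation_horizon(event: Event) -> int:
    # ASSUMPTION: xi = t_r - t_e - delta_t + 1 (not stated in this summary).
    return event.t_r - event.t_e - event.delta_t + 1


def required_memory(event: Event, context_length: int) -> str:
    """Return "STM" if the event lies within the context, else "LTM"."""
    if context_length < 1:
        raise ValueError("context_length (K) must be at least 1")
    xi = correlation_horizon(event)
    # ASSUMPTION: "within the context" is read as xi <= K.
    return "STM" if xi <= context_length else "LTM"
```

Under these assumptions, an event with `t_e=0`, `delta_t=1`, `t_r=10` has a correlation horizon of 10, so an agent with context length 10 would need only short-term memory, while an agent with context length 5 would need long-term memory for the same task.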

Theorems & Definitions (9)

  • Definition 1
  • Definition 2
  • Definition 3: Declarative and Procedural memory in RL
  • Definition 4: Memory DM types of memory
  • Definition 5: Memory-intensive environments
  • Theorem 1: On the context memory border
  • proof
  • Definition 6: Memory mechanisms
  • Definition 7: Meta-RL