A Survey Analyzing Generalization in Deep Reinforcement Learning

Ezgi Korkmaz

TL;DR

This paper explains the fundamental reasons why deep reinforcement learning policies encounter overfitting problems that limit their generalization capabilities, and categorizes and explains the various solution approaches that increase generalization and overcome overfitting in deep reinforcement learning policies.

Abstract

Reinforcement learning research has achieved significant success and attention through the use of deep neural networks to solve problems in high-dimensional state or action spaces. While deep reinforcement learning policies are currently being deployed in many different fields, from medical applications to large language models, the field is still trying to answer open questions about the generalization capabilities of these policies. In this paper, we formalize and analyze generalization in deep reinforcement learning. We explain the fundamental reasons why deep reinforcement learning policies encounter overfitting problems that limit their generalization capabilities. Furthermore, we categorize and explain the various solution approaches that increase generalization and overcome overfitting in deep reinforcement learning policies. From exploration to adversarial analysis, and from regularization to robustness, our paper analyzes a wide range of subfields within deep reinforcement learning with a broad scope and an in-depth view. We believe our study can provide a compact guideline to the current advancements in deep reinforcement learning, and help to construct robust deep neural policies with higher generalization skills.

Paper Structure (22 sections, 20 equations, 4 figures, 5 tables)

Introduction
Preliminaries on Deep Reinforcement Learning
How to Achieve Generalization?
Generic Reinforcement Learning Algorithm
Base Generalization in Deep Reinforcement Learning
Algorithmic Generalization
Generalization Through Rewards
Generalization Through Observations
Generalization Through Environment Dynamics
Generalization Through Policy
Assessing Generalization
Roots of Overestimation in Deep Reinforcement Learning
The Role of Exploration in Overfitting
Regularization
Data Augmentation
...and 7 more sections

Figures (4)

Figure 1: Robust adversarial reinforcement learning as proposed in pinto17. That work uses a zero-sum game to model the relationship between the agent and the adversary, focusing on disturbances introduced to the environment dynamics. The empirical studies are conducted in the MuJoCo environment.
Figure 2: State transformation generalization from an adversarial perspective in the Arcade Learning Environment korkmaz2023aaai. Note that in the adversarial line of research, state transformation generalization is constrained by the imperceptibility of the transformations. Columns: base frame, shifting, perspective transformation, blurring, discrete cosine transform artifacts, brightness and contrast. Top: JamesBond. Bottom: BankHeist.
Figure 3: Meta-training of the learned policy gradient described in oh20. Right: The learned policy gradient algorithm, trained on toy examples, can generalize to more complex environments such as the Arcade Learning Environment.
Figure 4: Transfer in reinforcement learning as described in gamrian19, which falls under the generalization through observations category explained in Definition 3.5. The frames are taken from the Breakout game in the Arcade Learning Environment. The left frames represent the target task and the right frames represent the source tasks generated via generative adversarial networks.

Theorems & Definitions (8)

Definition 3.1: Generic reinforcement learning algorithm
Definition 3.2: Base generalization
Definition 3.3: Algorithmic generalization
Definition 3.4: Rewards transforming generalization
Definition 3.5: State transforming generalization
Definition 3.6: Transition probability transforming generalization
Definition 3.7: Policy transforming generalization
Definition 3.8: Generalization testing
