Learning Agents With Prioritization and Parameter Noise in Continuous State and Action Space

Rajesh Mangannavar, Gopalakrishnan Srinivasaraghavan

TL;DR

This paper introduces a prioritized variant that combines state-of-the-art approaches such as Deep Q-learning (DQN) and Deep Deterministic Policy Gradient (DDPG), and it outperforms earlier results on continuous state and action space problems.

Abstract

Among the many variants of RL, an important class of problems is one where the state and action spaces are continuous. Autonomous robots, autonomous vehicles, and optimal control are all examples of such problems: they lend themselves naturally to reinforcement learning algorithms and have continuous state and action spaces. In this paper, we introduce a prioritized form of a combination of state-of-the-art approaches such as Deep Q-learning (DQN) and Deep Deterministic Policy Gradient (DDPG) to outperform the earlier results for continuous state and action space problems. Our experiments also use parameter noise during training, resulting in more robust deep RL models that significantly outperform the earlier results. We believe these results are a valuable addition for continuous state and action space problems.

Paper Structure

This paper contains 22 sections, 4 equations, 3 figures, 1 algorithm.

Figures (3)

  • Figure 1: Prioritized DDPG vs DDPG
  • Figure 2: Prioritized DDPG across all noises: adaptive-param, uncorrelated, correlated, and with no noise
  • Figure 3: Prioritized DDPG with noise vs DDPG