Online Reinforcement Learning with Passive Memory

Anay Pattanaik; Lav R. Varshney

TL;DR

Using passive memory improves performance in online reinforcement learning. The paper provides regret guarantees that turn out to be near-minimax optimal, and the results show that the quality of the passive memory determines the sub-optimality of the incurred regret.

Abstract

This paper considers an online reinforcement learning algorithm that leverages pre-collected data from the environment (passive memory) during online interaction. We show that using passive memory improves performance, and we provide theoretical regret guarantees that turn out to be near-minimax optimal. The results show that the quality of the passive memory determines the sub-optimality of the incurred regret. The proposed approach and results hold in both continuous and discrete state-action spaces.
