Offline Multitask Representation Learning for Reinforcement Learning

Haque Ishfaq; Thanh Nguyen-Tang; Songtao Feng; Raman Arora; Mengdi Wang; Ming Yin; Doina Precup

TL;DR

The theoretical results demonstrate the benefits of using the learned representation from the upstream offline task instead of directly learning the representation of the low-rank model.

Abstract

We study offline multitask representation learning in reinforcement learning (RL), where a learner is provided with an offline dataset from different tasks that share a common representation and is asked to learn the shared representation. We theoretically investigate offline multitask low-rank RL, and propose a new algorithm called MORL for offline multitask representation learning. Furthermore, we examine downstream RL in reward-free, offline and online scenarios, where a new task is introduced to the agent that shares the same representation as the upstream offline tasks. Our theoretical results demonstrate the benefits of using the learned representation from the upstream offline task instead of directly learning the representation of the low-rank model.

Paper Structure (59 sections, 41 theorems, 175 equations, 1 table, 5 algorithms)

Introduction
Preliminary
Episodic MDP.
Offline Multitask RL with Downstream Learning
Upstream Offline Multitask Representation Learning
Algorithm Design
Theoretical Result on Upstream Task
Downstream RL: Reward-free Exploration, Offline RL and Online RL
Relationship between upstream and downstream MDPs
Downstream Reward-Free RL
Downstream Offline and Online RL
Related Work
Conclusion
Additional Related Work
Omitted Algorithms
...and 44 more sections

Key Result

Theorem 3.3

Theorems & Definitions (49)

Definition 2.1: Low-rank MDPs
Definition 3.2: Multi-task relative condition number
Theorem 3.3
Remark 3.4
Lemma 3.5
Definition 4.2: $\epsilon$-approximate linear MDP (Jin et al., 2019; Cheng et al., 2022)
Lemma 4.3
Theorem 4.4
Theorem 4.6
Theorem 4.7
...and 39 more
