Offline Multitask Representation Learning for Reinforcement Learning

Haque Ishfaq, Thanh Nguyen-Tang, Songtao Feng, Raman Arora, Mengdi Wang, Ming Yin, Doina Precup

TL;DR

The theoretical results demonstrate the benefits of using the learned representation from the upstream offline task instead of directly learning the representation of the low-rank model.

Abstract

We study offline multitask representation learning in reinforcement learning (RL), where a learner is provided with an offline dataset from different tasks that share a common representation and is asked to learn the shared representation. We theoretically investigate offline multitask low-rank RL, and propose a new algorithm called MORL for offline multitask representation learning. Furthermore, we examine downstream RL in reward-free, offline and online scenarios, where a new task is introduced to the agent that shares the same representation as the upstream offline tasks. Our theoretical results demonstrate the benefits of using the learned representation from the upstream offline task instead of directly learning the representation of the low-rank model.

Paper Structure (59 sections, 41 theorems, 175 equations, 1 table, 5 algorithms)

Key Result

Theorem 3.3

Theorems & Definitions (49)

  • Definition 2.1: Low-rank MDPs
  • Definition 3.2: Multi-task relative condition number
  • Theorem 3.3
  • Remark 3.4
  • Lemma 3.5
  • Definition 4.2: $\epsilon$-approximate linear MDP [jin2019provably, cheng2022provable]
  • Lemma 4.3
  • Theorem 4.4
  • Theorem 4.6
  • Theorem 4.7
  • ...and 39 more