The Limits of Transfer Reinforcement Learning with Latent Low-rank Structure

Tyler Sam; Yudong Chen; Christina Lee Yu

TL;DR

This work considers the problem of transferring a latent low-rank representation when the source and target MDPs have transition kernels with low Tucker rank, and introduces the transfer-ability coefficient $\alpha$, which measures the difficulty of representational transfer.

Abstract

Many reinforcement learning (RL) algorithms are too costly to use in practice due to the large sizes $S$ and $A$ of the problem's state and action spaces. To resolve this issue, we study transfer RL with latent low-rank structure. We consider the problem of transferring a latent low-rank representation when the source and target MDPs have transition kernels with Tucker rank $(S, d, A)$, $(S, S, d)$, $(d, S, A)$, or $(d, d, d)$. In each setting, we introduce the transfer-ability coefficient $\alpha$ that measures the difficulty of representational transfer. Our algorithm learns latent representations in each source MDP and then exploits the linear structure to remove the dependence on $S$, $A$, or $SA$ in the target MDP regret bound. We complement our positive results with information-theoretic lower bounds that show our algorithms (excluding the $(d, d, d)$ setting) are minimax-optimal with respect to $\alpha$.
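
To make the Tucker rank conditions above concrete, the following is a minimal sketch (not the paper's algorithm) that builds a random transition kernel with low rank along the current-state mode and checks its Tucker rank numerically. The helper names `tucker_rank` and `random_kernel_low_rank_in_state`, the mode ordering (next state, state, action), and the factorization used to generate the kernel are all assumptions made for illustration; the Tucker rank itself is computed in the standard way, as the matrix rank of each mode unfolding.

```python
import numpy as np


def tucker_rank(tensor, tol=1e-8):
    """Multilinear (Tucker) rank: the matrix rank of each mode unfolding."""
    ranks = []
    for mode in range(tensor.ndim):
        # Bring `mode` to the front and flatten all remaining modes.
        unfolding = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        ranks.append(int(np.linalg.matrix_rank(unfolding, tol=tol)))
    return tuple(ranks)


def random_kernel_low_rank_in_state(S, A, d, seed=0):
    """Hypothetical generator: P[s', s, a] = sum_i F[s, i] * W[i, a, s'].

    F maps each state to a distribution over d latent factors, and each
    (latent factor, action) pair has its own next-state distribution.
    """
    rng = np.random.default_rng(seed)
    F = rng.dirichlet(np.ones(d), size=S)        # shape (S, d), rows sum to 1
    W = rng.dirichlet(np.ones(S), size=(d, A))   # shape (d, A, S), last axis sums to 1
    # Assumed axis order of the kernel tensor: (next state, state, action).
    return np.einsum("si,iat->tsa", F, W)


S, A, d = 6, 4, 2
P = random_kernel_low_rank_in_state(S, A, d)

# Each P[:, s, a] is a probability distribution over next states.
print(np.allclose(P.sum(axis=0), 1.0))
# Only the current-state mode is low rank in this construction.
print(tucker_rank(P))
```

Under these assumptions, only the unfolding along the current-state mode has rank $d$, which corresponds to the $(S, d, A)$ setting. Applying the same kind of factorization to the action mode or the next-state mode would give analogous sketches of the $(S, S, d)$ and $(d, S, A)$ settings.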
