DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
Irina Higgins, Arka Pal, Andrei A. Rusu, Loic Matthey, Christopher P Burgess, Alexander Pritzel, Matthew Botvinick, Charles Blundell, Alexander Lerchner
TL;DR
The paper addresses domain adaptation in reinforcement learning by introducing DARLA, a three-stage agent that first learns to perceive the environment through disentangled representations, then learns a source policy on top of those representations, and finally transfers zero-shot to target domains. For perception, DARLA uses a beta-VAE whose reconstruction loss is a perceptual similarity loss computed in the feature space of a pre-trained denoising autoencoder, yielding factorised latent states that generalise across domain shifts. Experiments on DeepMind Lab and on the Jaco arm (in MuJoCo simulation and sim2real) show consistent gains in zero-shot transfer over baselines across several base RL algorithms (DQN, A3C and EC), and transfer quality correlates strongly with how disentangled the representation is. The work argues that learning disentangled vision is a key lever for robust domain adaptation in deep RL, and the pipeline is not tied to any single base RL algorithm.
Abstract
Domain adaptation is an important open problem in deep reinforcement learning (RL). In many scenarios of interest, data is hard to obtain, so agents may learn a source policy in a setting where data is readily available, with the hope that it generalises well to the target domain. We propose a new multi-stage RL agent, DARLA (DisentAngled Representation Learning Agent), which learns to see before learning to act. DARLA's vision is based on learning a disentangled representation of the observed environment. Once DARLA can see, it is able to acquire source policies that are robust to many domain shifts, even with no access to the target domain. DARLA significantly outperforms conventional baselines in zero-shot domain adaptation scenarios, an effect that holds across a variety of RL environments (Jaco arm, DeepMind Lab) and base RL algorithms (DQN, A3C and EC).
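
The perception stage summarised in the TL;DR (a beta-VAE trained with a perceptual similarity loss) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a diagonal Gaussian posterior and a unit Gaussian prior, as is standard for beta-VAEs, and the names `gaussian_kl`, `perceptual_beta_vae_loss` and `toy_feature_fn` are hypothetical. In particular, `toy_feature_fn` is only a fixed random projection standing in for the frozen, pre-trained denoising autoencoder features, and the value of `beta` is illustrative rather than taken from the paper.

```python
import numpy as np


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def perceptual_beta_vae_loss(x, x_recon, mu, log_var, feature_fn, beta=1.0):
    """Beta-VAE loss with the reconstruction term measured in feature space.

    feature_fn is assumed to be a frozen feature extractor (hypothetical
    stand-in for a pre-trained denoising autoencoder's encoder).
    """
    feat_x = feature_fn(x)              # features of the input
    feat_recon = feature_fn(x_recon)    # features of the reconstruction
    recon = np.sum((feat_x - feat_recon) ** 2, axis=-1)  # per-example distance
    kl = gaussian_kl(mu, log_var)       # per-example KL to the prior
    return float(np.mean(recon + beta * kl))


# Hypothetical frozen feature extractor: fixed random projection plus tanh.
_rng = np.random.default_rng(0)
_W = _rng.normal(size=(16, 8))


def toy_feature_fn(x):
    return np.tanh(x @ _W)


if __name__ == "__main__":
    x = _rng.normal(size=(4, 16))                    # batch of flattened inputs
    x_recon = x + 0.1 * _rng.normal(size=x.shape)    # imperfect reconstruction
    mu = 0.5 * _rng.normal(size=(4, 3))              # posterior means
    log_var = np.zeros((4, 3))                       # posterior log-variances
    print(perceptual_beta_vae_loss(x, x_recon, mu, log_var, toy_feature_fn, beta=4.0))
```

Raising `beta` weights the KL term more heavily, which is the mechanism a beta-VAE uses to encourage factorised latents. Measuring reconstruction error in a feature space rather than in pixel space is what makes the loss "perceptual" in this sketch.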
