Meta-Learning and Meta-Reinforcement Learning - Tracing the Path towards DeepMind's Adaptive Agent
Björn Hoppmann, Christoph Scholz
TL;DR
The paper addresses rapid adaptation to novel tasks by providing a rigorous task-based formalism for meta-learning and meta-RL and by tracing a timeline of landmark algorithms from MAML to DeepMind's Adaptive Agent (ADA). It unifies these approaches under a two-stage framework: inner task-specific learning guided by a meta-parameter $\varphi$ and outer meta-optimization via a meta-loss, with explicit definitions for performance measures like $\mathcal{L}_{gen}^{meta}$ and adaptation speed. Its main contributions are the formal derivations of meta-learning paradigms, the comprehensive timeline of gradient-based, memory-based, and transformer-based meta-RL methods, and the analysis of techniques such as distillation and automated curriculum learning that scale to generalist agents. The work highlights the shift toward large-scale foundation-model-like meta-RL systems and discusses practical implications, benchmarking challenges, and ethical considerations as these agents move toward real-world deployment.
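The two-stage framework can be illustrated with a toy sketch. It makes several assumptions that are not taken from the paper: each task is a scalar quadratic loss $\frac{1}{2}(\theta - c)^2$ with a task-specific target $c$, the meta-parameter $\varphi$ is read as a shared initialization (the MAML-style interpretation), the inner stage is a single gradient step, and all function names are hypothetical. It is one simple instance of the inner/outer structure, not the survey's general formalism.

```python
def inner_adapt(phi, target, inner_lr=0.1):
    """Inner stage: one gradient step on the task loss 0.5 * (theta - target)**2,
    starting from the meta-parameter phi (assumed here to be an initialization)."""
    grad = phi - target  # derivative of the task loss at theta = phi
    return phi - inner_lr * grad


def meta_loss(phi, targets, inner_lr=0.1):
    """Meta-loss: average task loss measured after inner adaptation."""
    losses = [0.5 * (inner_adapt(phi, t, inner_lr) - t) ** 2 for t in targets]
    return sum(losses) / len(losses)


def meta_gradient(phi, target, inner_lr=0.1):
    """Exact derivative of one task's post-adaptation loss with respect to phi."""
    theta = inner_adapt(phi, target, inner_lr)
    # Chain rule: d(theta)/d(phi) = 1 - inner_lr for this quadratic task.
    return (theta - target) * (1.0 - inner_lr)


def meta_train(targets, phi=0.0, inner_lr=0.1, outer_lr=0.5, steps=200):
    """Outer stage: gradient descent on the meta-loss with respect to phi."""
    for _ in range(steps):
        grads = [meta_gradient(phi, t, inner_lr) for t in targets]
        phi -= outer_lr * sum(grads) / len(grads)
    return phi


targets = [1.0, 2.0, 3.0]        # hypothetical task distribution
phi_star = meta_train(targets)   # learned meta-parameter
```

In this toy setting the outer loop moves $\varphi$ toward the mean of the task targets, the initialization from which a single inner step gives the lowest average loss across tasks.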
Abstract
Humans are highly effective at utilizing prior knowledge to adapt to novel tasks, a capability that standard machine learning models struggle to replicate due to their reliance on task-specific training. Meta-learning overcomes this limitation by allowing models to acquire transferable knowledge from various tasks, enabling rapid adaptation to new challenges with minimal data. This survey provides a rigorous, task-based formalization of meta-learning and meta-reinforcement learning and uses that paradigm to chronicle the landmark algorithms that paved the way for DeepMind's Adaptive Agent, consolidating the essential concepts needed to understand the Adaptive Agent and other generalist approaches.
