How to train your MAML
Antreas Antoniou, Harrison Edwards, Amos Storkey
TL;DR
The paper tackles the instability and inefficiency of Model-Agnostic Meta-Learning (MAML) in few-shot learning. It introduces MAML++, a set of six modifications: multi-step loss optimization (MSL), derivative-order annealing (DA), per-step batch normalization running statistics (BNRS), per-step batch normalization weights and biases (BNWB), learned per-layer per-step learning rates (LSLR), and cosine annealing of the meta-optimizer learning rate (CA). Together they stabilize training, automate hyperparameter choices, and improve generalization. Empirical results on Omniglot and Mini-Imagenet show state-of-the-art performance in 5-way 1-shot and 5-shot settings, with faster convergence and robustness to inner-loop updates. The findings highlight the importance of per-step parameter adaptation and per-step normalization for fast, reliable few-shot learning across tasks.
Abstract
The field of few-shot learning has recently seen substantial advancements, most of which came from casting few-shot learning as a meta-learning problem. Model-Agnostic Meta-Learning (MAML) is currently one of the best approaches for few-shot learning via meta-learning. MAML is simple, elegant and very powerful. However, it has a variety of issues: it is very sensitive to neural network architectures, which often leads to instability during training; it requires arduous hyperparameter searches to stabilize training and achieve high generalization; and it is very computationally expensive at both training and inference time. In this paper, we propose various modifications to MAML that not only stabilize the system, but also substantially improve its generalization performance and convergence speed and reduce its computational overhead. We call the resulting method MAML++.
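To make the inner-loop terminology concrete, the following is a minimal sketch of the forward computation only, assuming a one-parameter linear model with a mean-squared-error loss. It is not the authors' implementation: the names `inner_loop`, `step_lrs` and `step_weights` are hypothetical, the per-step learning rates are fixed rather than learned, and the outer-loop meta-gradient is omitted. It contrasts a loss taken only after the final inner step (as in plain MAML) with a weighted sum of the target-set losses after every step (the idea behind the multi-step loss).

```python
def mse(w, xs, ys):
    """Mean squared error of the toy model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Analytic gradient of the MSE with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def inner_loop(w0, support, target, step_lrs, step_weights):
    """Adapt w0 on the support set, one gradient step per entry in step_lrs.

    Returns the adapted weight, the target-set loss after each step, and
    the weighted sum of those losses (the quantity an outer loop would
    minimize; the outer loop itself is not shown here).
    """
    xs_s, ys_s = support
    xs_t, ys_t = target
    w = w0
    target_losses = []
    for lr in step_lrs:  # illustrative per-step learning rate (fixed, not learned)
        w = w - lr * mse_grad(w, xs_s, ys_s)
        target_losses.append(mse(w, xs_t, ys_t))
    outer_loss = sum(v * l for v, l in zip(step_weights, target_losses))
    return w, target_losses, outer_loss


# Toy task: y = 3x, with two support points and one target point.
support = ([1.0, 2.0], [3.0, 6.0])
target = ([3.0], [9.0])
step_lrs = [0.1, 0.1, 0.1]

# Multi-step loss: every inner step contributes to the outer objective.
w, losses, msl_loss = inner_loop(0.0, support, target, step_lrs, [1 / 3] * 3)

# Final-step loss only: all weight on the last inner step.
_, _, final_step_loss = inner_loop(0.0, support, target, step_lrs, [0.0, 0.0, 1.0])
```

In this toy setting the target loss shrinks at every step, so the multi-step objective is larger than the final-step one; the point of the sketch is only that every intermediate set of weights receives a training signal when the per-step losses are summed.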
