When does MAML Work the Best? An Empirical Study on Model-Agnostic Meta-Learning in NLP Applications
Zequn Liu, Ruiyi Zhang, Yiping Song, Wei Ju, Ming Zhang
TL;DR
The paper investigates when Model-Agnostic Meta-Learning (MAML) yields the best results in NLP by examining three factors: data quantity, task similarity, and the balance between a general language model and task-specific adaptation. Using meta-learning with inner (per-task) and outer (meta-level) optimization updates, the study analyzes performance on four NLP datasets spanning tasks such as text classification and personalized dialogue generation. Key findings show that MAML offers its strongest advantages when per-task data is small and tasks are dissimilar, while highly similar tasks or abundant data reduce its relative gains and can favor standard fine-tuning. The results offer practical guidance for applying MAML in NLP: choose the initialization according to the data and task-similarity regime and prefer simple fine-tuning strategies over per-task customization to achieve robust transfer with limited data.
Abstract
Model-Agnostic Meta-Learning (MAML) has been successfully employed in NLP applications, including few-shot text classification and multi-domain low-resource language generation. Many factors, including data quantity, similarity among tasks, and the balance between the general language model and task-specific adaptation, can affect the performance of MAML in NLP, but few works have studied them thoroughly. In this paper, we conduct an empirical study of these factors and, based on the experimental results, conclude when MAML works best.
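For readers unfamiliar with the inner and outer optimization updates mentioned in the TL;DR, the following is a minimal sketch of the general idea, not the setup used in the paper. It assumes a hypothetical toy problem (1-D linear regression tasks y = a * x with a task-specific slope a) and uses the first-order approximation of MAML, which drops second-order terms in the meta-gradient. All function names, task definitions, and hyperparameters (alpha, beta, tasks_per_step) are illustrative assumptions.

```python
import random


def mse(w, data):
    """Mean squared error of the 1-D model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def mse_grad(w, data):
    """Gradient of the mean squared error with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def sample_task(rng, n=5):
    """Hypothetical task: y = a * x, with slope a drawn per task.

    Returns a (support set, query set) pair of (x, y) examples.
    """
    a = rng.uniform(0.5, 1.5)

    def draw():
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        return [(x, a * x) for x in xs]

    return draw(), draw()


def inner_update(w, support, alpha):
    """Inner loop: one gradient step on the task's support set."""
    return w - alpha * mse_grad(w, support)


def fomaml(steps=500, tasks_per_step=4, alpha=0.1, beta=0.05, seed=0):
    """Outer loop: update the shared initialization w across tasks."""
    rng = random.Random(seed)
    w = 0.0  # shared initialization being meta-learned
    for _ in range(steps):
        meta_grad = 0.0
        for _ in range(tasks_per_step):
            support, query = sample_task(rng)
            w_task = inner_update(w, support, alpha)
            # First-order approximation: use the query-set gradient at the
            # adapted parameter as the gradient for the initialization.
            meta_grad += mse_grad(w_task, query)
        w -= beta * meta_grad / tasks_per_step
    return w


if __name__ == "__main__":
    w_init = fomaml()
    print("meta-learned initialization:", w_init)
```

In this sketch the inner step adapts a copy of the parameter to one task, while the outer step moves the shared initialization so that a single inner step works well on average across tasks. Full MAML would additionally differentiate through the inner step, which requires second-order gradients.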
