Rethinking Meta-Learning from a Learning Lens

Jingyao Wang, Wenwen Qiang, Changwen Zheng, Hui Xiong, Gang Hua

TL;DR

This work reframes meta-learning from a purely initialization-centric view to a Learning lens, proposing that a meta-learning model comprises initialization layers plus a gradient-based meta-layer to balance capacity and data. It introduces TRLearner, a plug-and-play approach that learns task relations via an adaptive sampler and enforces relation-aware consistency across similar tasks to guide optimization. Theoretical results show TRLearner reduces excess risk and improves OOD generalization when task relations are accurate, and empirical results across regression, image classification, drug activity, pose prediction, and OOD benchmarks demonstrate consistent improvements over strong baselines. The combination of theory and diverse experiments suggests that explicitly leveraging inter-task relationships can significantly enhance generalization in few-shot and cross-domain settings, with practical applicability across domains.

Abstract

Meta-learning seeks to learn a well-generalized model initialization from training tasks to solve unseen tasks. From the "learning to learn" perspective, the quality of the initialization is modeled with one-step gradient descent in the inner loop. However, contrary to theoretical expectations, our empirical analysis reveals that this may expose meta-learning to underfitting. To bridge the gap between theoretical understanding and practical implementation, we reconsider meta-learning from the "Learning" lens. We propose that the meta-learning model comprises two interrelated components: parameters for model initialization and a meta-layer for task-specific fine-tuning. Depending on the task, these components expose the model to the risks of overfitting and underfitting, and the respective remedies, fewer parameters vs. more meta-layers, are often in conflict. To address this, we aim to regulate the task information the model receives without modifying the data or model structure. Our theoretical analysis indicates that models adapted to different tasks can mutually reinforce each other, highlighting the effective information. Based on this insight, we propose TRLearner, a plug-and-play method that leverages task relation to calibrate meta-learning. It first extracts task relation matrices and then applies relation-aware consistency regularization to guide optimization. Extensive theoretical and empirical evaluations demonstrate its effectiveness.
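
To make the two steps named in the abstract concrete, here is a minimal sketch of a one-step inner loop followed by a relation-aware consistency term. It is an illustration under stated assumptions, not the paper's implementation: the linear model, the mean-feature task descriptor, the cosine-similarity relation matrix, and every function name (`inner_step`, `relation_matrix`, `consistency_loss`) are hypothetical choices made for this example. The paper extracts relations with an adaptive sampler, which is not reproduced here.

```python
import math

def predict(w, x):
    """Linear model: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def inner_step(w, xs, ys, lr=0.1):
    """One gradient-descent step on mean squared error (the inner loop)."""
    n = len(xs)
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = predict(w, x) - y
        for d, xd in enumerate(x):
            grad[d] += 2.0 * err * xd / n
    return [wi - lr * g for wi, g in zip(w, grad)]

def task_embedding(xs):
    """Mean input vector of a task (assumed stand-in for a task descriptor)."""
    n = len(xs)
    return [sum(x[d] for x in xs) / n for d in range(len(xs[0]))]

def cosine(u, v):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def relation_matrix(tasks):
    """Pairwise task similarity (assumption: cosine of mean inputs)."""
    embs = [task_embedding(xs) for xs, _ in tasks]
    return [[cosine(a, b) for b in embs] for a in embs]

def consistency_loss(w0, tasks, relation, lr=0.1):
    """Penalize task i's labels against a relation-weighted blend of the
    predictions made by models adapted to the *other* tasks."""
    adapted = [inner_step(w0, xs, ys, lr) for xs, ys in tasks]
    total, count = 0.0, 0
    for i, (xs, ys) in enumerate(tasks):
        # Keep only non-negative relations to other tasks as blend weights.
        weights = [max(relation[i][j], 0.0) if j != i else 0.0
                   for j in range(len(tasks))]
        z = sum(weights)
        if z == 0.0:
            continue  # no related task, so no constraint for task i
        for x, y in zip(xs, ys):
            blended = sum(wt * predict(adapted[j], x)
                          for j, wt in enumerate(weights)) / z
            total += (blended - y) ** 2
            count += 1
    return total / count if count else 0.0
```

In a training loop, a term like `consistency_loss` would be added to the usual outer-loop meta-objective with a weighting coefficient, so that models adapted to similar tasks are encouraged to agree on each other's data.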

Paper Structure (59 sections, 3 theorems, 41 equations, 9 figures, 6 tables)

This paper contains 59 sections, 3 theorems, 41 equations, 9 figures, 6 tables.

Key Result

Theorem 1

Regardless of the correlation between the label variables $Y_i$ and $Y_j$, a classifier for task $\tau_i$ that assigns non-zero weights to the task-specific factors of task $\tau_j$, with importance $\zeta \propto \text{sim}(X_i,X_j)$, achieves better performance, where $\text{sim}(\cdot)$ is the similarity

Figures (9)

  • Figure 1: Reformulation of meta-learning model $\mathcal{F}_\theta$. (a) briefly shows how to model $\mathcal{F}_\theta$. (b) shows the learning process under the modeling of $\mathcal{F}_\theta$ in (a). The black solid line represents the forward computation process, while the red dashed line indicates the backward propagation process.
  • Figure 2: Motivating evidence about the performance of the model on $\mathcal{D}_1$-$\mathcal{D}_4$. Each group of tasks has a different sampling score, i.e., 0.74, 0.68, 0.31, and 0.29 respectively. Higher sampling scores indicate greater task complexity.
  • Figure 3: Illustration of meta-learning with TRLearner. TRLearner uses the task relation matrix $\mathcal{M}$ and the regularization term $\mathcal{L}_{TR}$ to calibrate optimization. The black line is for the original meta-learning process, while the red line represents the calibration by TRLearner. The pseudo-code is provided in Algorithm 1.
  • Figure 4: Effect of regularization $\mathcal{L}_{TR}$ on miniImagenet.
  • Figure 5: Parameter sensitivity on miniImagenet.
  • ...and 4 more figures

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2
  • Theorem 3