Learning to Learn with Contrastive Meta-Objective

Shiguang Wu, Yaqing Wang, Yatao Bian, Quanming Yao

TL;DR

ConML introduces a universal, learner-agnostic contrastive meta-objective that operates on model-space representations to enhance alignment within tasks and discrimination across tasks, leveraging the task identity that is intrinsic to mini-batch episodic meta-training. By combining the traditional episodic loss with a contrastive loss $\mathcal{L}_c$ that balances inner-task and inter-task distances, ConML provides theoretical generalization benefits and can be integrated with optimization-, metric-, and amortization-based learners, and even with in-context learning (ICL). Empirically, ConML yields consistent improvements in few-shot image classification, cross-domain meta-learning, and ICL across diverse backbones and tasks, with modest computational overhead. The work demonstrates that task-level contrastive supervision can robustly improve fast adaptation and task-level generalization, and it outlines future directions for sampling strategies, distance metrics, and representation design.

Abstract

Meta-learning enables learning systems to adapt quickly to new tasks, similar to humans. Different meta-learning approaches all work under the mini-batch episodic training framework. This framework naturally provides information about task identity, which can serve as additional supervision for meta-training to improve generalizability. We propose to exploit task identity as additional supervision in meta-training, inspired by the alignment and discrimination abilities that are intrinsic to humans' fast learning. This is achieved by contrasting what meta-learners learn, i.e., model representations. The proposed ConML evaluates and optimizes the contrastive meta-objective under a problem- and learner-agnostic meta-training framework. We demonstrate that ConML integrates seamlessly with existing meta-learners, as well as in-context learning models, and brings a significant boost in performance at a small implementation cost.
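The idea of contrasting model representations by task identity can be illustrated with a small sketch. It is not the authors' implementation: it assumes each task in a mini-batch yields several representation vectors (for example, from different subsets of the task's data), that the distance function $\phi$ is cosine distance, and that $\mathcal{L}_c$ takes the simple form "mean inner-task distance minus mean inter-task distance", added to the episodic loss with weight $\lambda$. The names `contrastive_meta_loss`, `total_loss`, and `reps_by_task` are hypothetical.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Assumed choice of distance phi: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def mean(values):
    """Average of an iterable; 0.0 when it is empty."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def contrastive_meta_loss(reps_by_task, phi=cosine_distance):
    """Assumed form of L_c: inner-task distance minus inter-task distance.

    reps_by_task maps a task id to a list of model representations
    (plain lists of floats) obtained for that task.
    """
    # Alignment: representations of the same task should be close.
    inner = mean(
        phi(a, b)
        for reps in reps_by_task.values()
        for a, b in combinations(reps, 2)
    )
    # Discrimination: representations of different tasks should be far apart.
    inter = mean(
        phi(a, b)
        for (_, reps_1), (_, reps_2) in combinations(reps_by_task.items(), 2)
        for a in reps_1
        for b in reps_2
    )
    return inner - inter


def total_loss(episodic_loss, reps_by_task, lam=0.1):
    """Episodic loss plus the weighted contrastive term (lam is lambda)."""
    return episodic_loss + lam * contrastive_meta_loss(reps_by_task)


# Two tasks whose representations are identical within a task
# and orthogonal across tasks.
example = {"task_a": [[1.0, 0.0], [1.0, 0.0]], "task_b": [[0.0, 1.0], [0.0, 1.0]]}
loss_c = contrastive_meta_loss(example)
```

In this toy batch the inner-task distance is 0 and the inter-task distance is 1, so the contrastive term is at its most favorable value for cosine distance between orthogonal representations. In a real meta-learner the representations would be differentiable functions of the meta-parameters, and the sum would be minimized by gradient descent.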

Paper Structure

This paper contains 35 sections, 4 theorems, 28 equations, 7 figures, 7 tables, 9 algorithms.

Key Result

Lemma 1

Denote $U_{p(\tau)}(\theta)=C_1\sqrt{\sup_{\|v\|\leq 1}\mathbb{E}_{\tau\sim p(\tau)}\mathbb{E}_{(x,y)\sim \tau}[\langle v,g(\{(x,y)\};\theta)\rangle^2]}+C_2$. There exist positive constants $C_1,C_2$, independent of $\theta$, such that for all $\theta$, $\Delta\epsilon_{p(\tau)}(\theta) \leq U_{p(\tau)}(\theta)$.
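
The supremum inside $U_{p(\tau)}(\theta)$ has a concrete reading: for vectors $g$, $\sup_{\|v\|\leq 1} E[\langle v,g\rangle^2]$ is the largest eigenvalue of the second-moment matrix $E[g g^\top]$. The sketch below estimates that quantity from a finite sample using power iteration. It is illustrative only: the function names are hypothetical, the expectation is replaced by an empirical average over sampled vectors, and the constants $C_1, C_2$ are not specified in the text, so they appear as placeholder arguments.

```python
import math


def second_moment(samples):
    """Empirical second-moment matrix (1/n) * sum_i g_i g_i^T."""
    n = len(samples)
    d = len(samples[0])
    return [
        [sum(g[i] * g[j] for g in samples) / n for j in range(d)]
        for i in range(d)
    ]


def top_eigenvalue(matrix, iters=200):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration.

    Caveat: the uniform start vector must not be orthogonal to the
    leading eigenvector, otherwise the iteration converges elsewhere.
    """
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    eigenvalue = 0.0
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        # Rayleigh quotient v^T M v with the unit vector v.
        eigenvalue = sum(
            v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)
        )
    return eigenvalue


def empirical_upper_bound(samples, c1=1.0, c2=0.0):
    """C_1 * sqrt(sup term) + C_2 with placeholder constants c1, c2."""
    return c1 * math.sqrt(top_eigenvalue(second_moment(samples))) + c2


# Two sampled vectors standing in for g({(x, y)}; theta).
toy_samples = [[2.0, 0.0], [0.0, 1.0]]
sup_term = top_eigenvalue(second_moment(toy_samples))
```

For the toy sample the second-moment matrix is diagonal with entries 2 and 0.5, so the estimated supremum is 2. Read this way, the bound is smaller when the representations do not concentrate their second moment along a single direction.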

Figures (7)

  • Figure 1: ConML performs contrastive learning in model space, so that the meta-learner itself aligns information from the same task (alignment) while discriminating between different tasks (discrimination), which improves generalizability.
  • Figure 2: The effect of the distance function $\phi$, the form of the contrastive loss $\mathcal{L}_c$, and the contrastive weight $\lambda$.
  • Figure 3: Varying the number of in-context examples during inference of ICL.
  • Figure 4: Evaluation of ConML on synthetic few-shot regression.
  • Figure : Mini-Batch Episodic Training (with Validation Loss).
  • ...and 2 more figures

Theorems & Definitions (6)

  • Lemma 1
  • Theorem 1
  • Lemma 2: Upper Bound from maurer2016benefit
  • Lemma 3: Universal Approximation of MLP from hornik1991approximation
  • proof
  • proof