Hacking Task Confounder in Meta-Learning
Jingyao Wang, Yi Ren, Zeen Song, Jianqi Zhang, Changwen Zheng, Wenwen Qiang
TL;DR
The paper identifies Task Confounders as cross-task spurious correlations that degrade meta-learning generalization and analyzes them with Structural Causal Models. It proposes MetaCRL, a plug-and-play causal representation learner with a Disentangling Module and a Causal Module that enforce decoupled generating factors and invariant causality through bi-level optimization. Across sinusoid regression, image classification, drug activity prediction, and pose estimation, MetaCRL consistently yields state-of-the-art results and reduces negative transfer, validating the importance of causal representation in meta-learning. The approach offers a practical pathway to more robust meta-learners by explicitly decoupling task-specific factors and enforcing invariance to distribution shifts during training.
Abstract
Meta-learning enables rapid generalization to new tasks by learning knowledge from various tasks. It is intuitively assumed that as training progresses, a model will acquire richer knowledge, leading to better generalization performance. However, our experiments reveal an unexpected result: there is negative knowledge transfer between tasks, which harms generalization performance. To explain this phenomenon, we conduct a causal analysis using Structural Causal Models (SCMs). Our investigation uncovers spurious correlations between task-specific causal factors and labels in meta-learning. Furthermore, the confounding factors differ across batches. We refer to these confounding factors as "Task Confounders". Based on these findings, we propose a plug-and-play Meta-learning Causal Representation Learner (MetaCRL) to eliminate task confounders. It encodes decoupled generating factors from multiple tasks and uses an invariance-based bi-level optimization mechanism to ensure their causality for meta-learning. Extensive experiments on various benchmark datasets demonstrate that our work achieves state-of-the-art (SOTA) performance.
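
The spurious correlation described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under our own assumptions, not the SCM or the MetaCRL method from the paper: the names `factor_a`, `factor_b`, and `y_task1`, the linear-Gaussian form, and the coefficients are all hypothetical. Factor A is causal for task 1, factor B is causal only for some other task, and the two happen to co-occur within a training batch. B then correlates with the task 1 label even though it has no effect on it, and the correlation disappears once B is set independently of A.

```python
import numpy as np


def spurious_vs_intervened(n=20000, seed=0):
    """Return (observed, intervened) correlation between factor B and the task 1 label."""
    rng = np.random.default_rng(seed)

    # Hypothetical factor A: the only true cause of the task 1 label.
    factor_a = rng.normal(size=n)

    # Hypothetical factor B: causal for a different task only, but it
    # co-occurs with A in this batch (assumed linear dependence).
    factor_b = 0.8 * factor_a + 0.6 * rng.normal(size=n)

    # Task 1 label depends on A alone, plus a little noise.
    y_task1 = factor_a + 0.1 * rng.normal(size=n)

    # Observational view: B looks predictive of the task 1 label.
    observed = np.corrcoef(factor_b, y_task1)[0, 1]

    # Interventional view: draw B independently of A, leaving the label as is
    # (B has no causal effect on it), so the dependence is broken.
    factor_b_do = rng.normal(size=n)
    intervened = np.corrcoef(factor_b_do, y_task1)[0, 1]

    return float(observed), float(intervened)


observed, intervened = spurious_vs_intervened()
print(f"observed corr(B, y1):   {observed:.3f}")
print(f"intervened corr(B, y1): {intervened:.3f}")
```

Under these assumed coefficients the observed correlation is clearly positive while the intervened one is close to zero. A learner that relies on B for task 1 would therefore fail whenever the co-occurrence pattern changes between batches, which is the kind of failure the abstract attributes to task confounders.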
