Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models
Yongxian Wei, Zixuan Hu, Li Shen, Zhenyi Wang, Yu Li, Chun Yuan, Dacheng Tao
TL;DR
This work addresses data-free meta-learning (DFML) when pre-trained models are heterogeneous. It reveals a heterogeneity-homogeneity trade-off where diverse models can both regularize and interfere with learning shared representations, and it proposes Task Groupings Regularization to exploit heterogeneity while mitigating conflicts. The method groups dissimilar pre-trained models via task-space embeddings computed from the Fisher Information Matrix and applies implicit gradient regularization (IGR) within each group to align gradient directions across tasks. Empirical results across CIFAR-FS, miniImageNet, and CUB, including multi-domain and multi-architecture settings, show that the approach consistently surpasses baselines, demonstrating robust generalization and practical impact for data-free meta-learning in heterogeneous environments.
Abstract
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data, enabling the rapid adaptation to new unseen tasks. Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts. In this paper, we empirically and theoretically identify and analyze the model heterogeneity in DFML. We find that model heterogeneity introduces a heterogeneity-homogeneity trade-off, where homogeneous models reduce task conflicts but also increase the overfitting risk. Balancing this trade-off is crucial for learning shared representations across tasks. Based on our findings, we propose Task Groupings Regularization that benefits from model heterogeneity by grouping and aligning conflicting tasks. Specifically, we embed pre-trained models into a task space to compute dissimilarity, and group heterogeneous models together based on this measure. Then, we introduce implicit gradient regularization within each group to mitigate potential conflicts. By encouraging a gradient direction suitable for all tasks, the meta-model captures shared representations that generalize across tasks. Comprehensive experiments showcase the superiority of our approach in multiple benchmarks, effectively tackling the model heterogeneity in challenging multi-domain and multi-architecture scenarios.
