FREE: Faster and Better Data-Free Meta-Learning
Yongxian Wei, Zixuan Hu, Zhenyi Wang, Li Shen, Chun Yuan, Dacheng Tao
TL;DR
The paper addresses data-free meta-learning, where the original training data is unavailable, with a focus on efficiency and model heterogeneity. It proposes FREE, which combines a meta-generator (FIve) that adapts to each pre-trained model in just five steps with a gradient-aligned meta-learner (BelL) that uses implicit gradient alignment and cross-task distillation to generalize to unseen tasks. Empirical results on mini-ImageNet, CIFAR-FS, and CUB show a roughly 20$\times$ speed-up in data recovery and consistent accuracy gains ($1.42\%$ to $4.78\%$) over the state-of-the-art, including robust performance in multi-domain and multi-architecture settings. The approach advances privacy-preserving meta-learning by enabling fast reconstruction of task distributions across heterogeneous model pools and by learning task-invariant representations for unseen tasks.
Abstract
Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data, which offers practical benefits in settings constrained by data privacy concerns. Current DFML methods primarily focus on recovering data from these pre-trained models. However, they suffer from slow recovery speed and overlook the gaps inherent in heterogeneous pre-trained models. In response to these challenges, we introduce the Faster and Better Data-Free Meta-Learning (FREE) framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks. Specifically, within the module Faster Inversion via Meta-Generator, each pre-trained model is treated as a distinct task. The meta-generator can adapt to a specific task in just five steps, significantly accelerating data recovery. Furthermore, we propose Better Generalization via Meta-Learner and introduce an implicit gradient alignment algorithm to optimize the meta-learner. Aligning gradient directions alleviates potential conflicts among tasks derived from heterogeneous pre-trained models. Empirical experiments on multiple benchmarks affirm the superiority of our approach, showing a notable speed-up (20$\times$) and performance improvement (1.42%$\sim$4.78%) over the state-of-the-art.
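The abstract does not spell out the implicit gradient alignment algorithm, so the following is only an illustrative sketch of one well-known way to align gradients implicitly: a Reptile/Fish-style update that takes sequential inner steps across tasks and then moves the meta-parameters toward the adapted weights. It is an assumption that this resembles FREE's procedure. The toy quadratic task losses, the function names `task_grad` and `implicit_alignment_step`, and the learning rates are all hypothetical and chosen for illustration.

```python
import numpy as np

def task_grad(theta, center):
    # Gradient of a toy task loss 0.5 * ||theta - center||^2.
    # (Hypothetical stand-in for the loss on one recovered task.)
    return theta - center

def implicit_alignment_step(theta, centers, inner_lr=0.1, meta_lr=0.5):
    """One Reptile/Fish-style meta-update over a batch of tasks.

    Taking inner steps sequentially across tasks and then interpolating
    toward the result is known to favor parameter regions where task
    gradients agree, without computing gradient inner products explicitly.
    """
    fast = theta.copy()
    for center in centers:  # visit tasks one after another
        fast = fast - inner_lr * task_grad(fast, center)
    # Move the meta-parameters toward the sequentially adapted weights.
    return theta + meta_lr * (fast - theta)

# Four toy tasks in a 2-D parameter space (illustrative values only).
centers = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [1.0, 3.0]])
theta = np.zeros(2)
for _ in range(200):
    theta = implicit_alignment_step(theta, centers)
```

In this toy setting the meta-parameters settle at a weighted average of the task optima, so no single task dominates the update. With real networks the same loop would run over losses computed on tasks recovered from the pre-trained models.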
