Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

Zixuan Hu, Li Shen, Zhenyi Wang, Baoyuan Wu, Chun Yuan, Dacheng Tao

TL;DR

The paper tackles meta-learning when training data are unavailable and all pre-trained models are accessible only as black-box APIs. It introduces BiDf-MKD, a bi-level knowledge-distillation framework that learns a shared meta-initialization by (i) recovering label-conditional data from each API via a generator and zero-order gradient estimates, (ii) performing inner-level knowledge transfer to task-specific models and outer-level meta-knowledge distillation to the meta-model, and (iii) using boundary query set recovery and task memory replay to increase task diversity and counter knowledge vanish. The authors define the knowledge-vanish issue in data-free meta-learning, propose the boundary query set recovery mechanism to address it, and evaluate in three real-world scenarios (API-SS, API-SH, API-MH), reporting gains on CIFAR-FS, MiniImageNet, and CUB. The approach preserves data privacy because it never touches real training data, is agnostic to the architecture behind each API, and copes with limited API budgets through memory replay, offering a practical route to privacy-preserving meta-learning with MaaS APIs.
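The inner-level knowledge transfer in step (ii) can be illustrated with a generic distillation loss. This is a minimal sketch, not the paper's exact objective (which this summary does not state): it assumes the API returns a probability vector for each query, and the names `softmax`, `distillation_loss`, and `temperature` are hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(api_probs, student_logits, temperature=1.0, eps=1e-12):
    """KL(api_probs || softmax(student_logits)).

    api_probs:      probabilities returned by the black-box API (assumed).
    student_logits: raw outputs of the task-specific student model.
    """
    student_probs = softmax(student_logits, temperature)
    return sum(
        p * (math.log(p + eps) - math.log(q + eps))
        for p, q in zip(api_probs, student_probs)
        if p > 0.0  # terms with zero teacher probability contribute nothing
    )


if __name__ == "__main__":
    # The student is uniform over two classes; the API is confident in class 0.
    print(distillation_loss([0.9, 0.1], [0.0, 0.0]))
```

The loss is zero when the student reproduces the API's output distribution and grows as the two diverge, so minimizing it on recovered data pulls the student toward the API's behavior without access to the API's weights.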

Abstract

Data-free meta-learning (DFML) aims to enable efficient learning of new tasks by meta-learning from a collection of pre-trained models without access to the training data. Existing DFML work can only meta-learn from (i) white-box and (ii) small-scale pre-trained models (iii) with the same architecture, neglecting the more practical setting where users have only inference access to APIs whose underlying models may have arbitrary architectures and scales. To solve this issue, we propose a Bi-level Data-free Meta Knowledge Distillation (BiDf-MKD) framework to transfer more general meta knowledge from a collection of black-box APIs to one single meta model. Specifically, by just querying APIs, we invert each API to recover its training data via a zero-order gradient estimator and then perform meta-learning via a novel bi-level meta knowledge distillation structure, in which we design a boundary query set recovery technique to recover a more informative query set near the decision boundary. In addition, to encourage better generalization within the setting of limited API budgets, we propose task memory replay to diversify the underlying task distribution by covering more interpolated tasks. Extensive experiments in various real-world scenarios show the superior performance of our BiDf-MKD framework.
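Because a black-box API exposes only its outputs, gradients of a loss with respect to the query cannot be backpropagated and have to be estimated from function values alone. The sketch below shows the generic random-direction finite-difference form of a zero-order gradient estimator; it is not necessarily the exact estimator used in the paper. Here `f` stands in for any scalar loss computed from API responses, and `zero_order_gradient`, `num_directions`, and `mu` are hypothetical names introduced for illustration.

```python
import random


def zero_order_gradient(f, x, num_directions=20, mu=1e-3, rng=None):
    """Estimate the gradient of f at x using only evaluations of f.

    For each random Gaussian direction u, the forward difference
    (f(x + mu * u) - f(x)) / mu approximates the directional derivative
    along u. Averaging slope * u over many directions approximates the
    gradient, since E[(u . g) u] = g for standard Gaussian u.
    """
    rng = rng or random.Random(0)
    dim = len(x)
    base = f(x)  # one query at the unperturbed point
    grad = [0.0] * dim
    for _ in range(num_directions):
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x_pert = [xi + mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_pert) - base) / mu  # one extra query per direction
        for j in range(dim):
            grad[j] += slope * u[j] / num_directions
    return grad


if __name__ == "__main__":
    # Toy "black box": a quadratic whose true gradient at x is 2 * x.
    black_box = lambda v: sum(t * t for t in v)
    print(zero_order_gradient(black_box, [1.0, -2.0, 0.5], num_directions=8000))
```

Each direction costs one additional query, so `num_directions` trades estimate variance against the number of API calls.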

Paper Structure

This paper contains 30 sections, 17 equations, 9 figures, 10 tables, 1 algorithm.

Figures (9)

  • Figure 1: According to the datasets and model architectures inside the APIs, we propose three real-world black-box scenarios for a complete and practical evaluation of black-box DFML. We are the first to propose a unified framework simultaneously applicable to all three scenarios without any change, thus greatly expanding the real-world application scope of black-box DFML.
  • Figure 2: The whole pipeline of our proposed BiDf-MKD framework. For each API $A_i$, we recover its training data starting from the random standard Gaussian noise $\boldsymbol{Z}_i$. By continually querying the black-box API $A_i$, we gradually update the noise to label-conditional data. We then split the recovered data into the support set $\boldsymbol{S}_i$ and query set $\boldsymbol{Q}_i$ to perform meta-learning via our bi-level meta knowledge distillation structure. Alternatively, we can perform task memory replay with MAML over more interpolated tasks.
  • Figure 3: The knowledge vanish issue of meta-learning occurs when the outer-level optimization can be ignored.
  • Figure 4: Histogram of the reported accuracy of APIs.
  • Figure 5: Effect of the number of APIs in API-SS scenario.
  • ...and 4 more figures

Theorems & Definitions (7)

  • Definition 4.1
  • Definition 1.1
  • Definition 1.2
  • Definition 1.3
  • Definition 1.4
  • Definition 1.5
  • Definition 1.6