CaBaGe: Data-Free Model Extraction using ClAss BAlanced Generator Ensemble

Jonathan Rosenthal, Shanchao Liang, Kevin Zhang, Lin Tan

TL;DR

CaBaGE tackles data-free model extraction against black-box MLaaS systems with three complementary components: a generator ensemble, selective querying, and class-balanced difficulty-weighted replay. It also operates in a class-agnostic setting, learning the number of classes on the fly. The method alternates between training a clone ensemble and generating synthetic data to query the victim, all within a fixed query budget, to maximize extraction fidelity. Across seven datasets, it improves extracted-model accuracy by up to 43.13% over state-of-the-art data-free methods, with large gains on simple tasks and meaningful boosts on complex ones, and it cuts the number of queries needed to match the final accuracy of prior work by up to 75.7%. The work highlights practical vulnerabilities in MLaaS deployments and outlines limitations and future directions, such as robust hyperparameter tuning and extension to non-image data and imbalanced domains.
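The summary above names class-balanced difficulty-weighted replay but does not spell out its mechanics. As a rough illustration only, and not the authors' implementation, the idea can be sketched as a buffer that first picks a class uniformly and then picks a stored sample within that class in proportion to a difficulty score. The class name `BalancedReplay`, its methods, and the use of a generic non-negative `difficulty` value (for example, the clone's loss on the sample) are all assumptions made for this sketch.

```python
import random
from collections import defaultdict


class BalancedReplay:
    """Toy replay buffer (hypothetical): uniform over classes, weighted within a class."""

    def __init__(self, seed=0):
        # label -> list of (sample, difficulty) pairs
        self.by_class = defaultdict(list)
        self.rng = random.Random(seed)

    def add(self, sample, label, difficulty):
        # `difficulty` is any non-negative score; higher means harder (assumption).
        self.by_class[label].append((sample, float(difficulty)))

    def sample(self, k):
        labels = sorted(self.by_class)
        batch = []
        for _ in range(k):
            # Class-balanced step: every stored class is equally likely,
            # regardless of how many samples it holds.
            label = self.rng.choice(labels)
            items = self.by_class[label]
            # Difficulty-weighted step: harder samples are drawn more often.
            # The small constant keeps the weights valid if all scores are zero.
            weights = [d + 1e-8 for _, d in items]
            picked, _ = self.rng.choices(items, weights=weights, k=1)[0]
            batch.append((picked, label))
        return batch
```

With this design, a class that the generator rarely produces still contributes about as many replayed samples as a frequent one, which is the balancing effect the summary describes; how the paper actually scores difficulty or schedules replay is not stated in this section.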

Abstract

Machine Learning as a Service (MLaaS) is often provided as a pay-per-query, black-box system to clients. Such a black-box approach not only hinders open replication, validation, and interpretation of model results, but also makes it harder for white-hat researchers to identify vulnerabilities in the MLaaS systems. Model extraction is a promising technique to address these challenges by reverse-engineering black-box models. Since training data is typically unavailable for MLaaS models, this paper focuses on the realistic variant of the problem: data-free model extraction. We propose a data-free model extraction approach, CaBaGe, to achieve higher model extraction accuracy with a small number of queries. Our innovations include (1) a novel experience replay for focusing on difficult training samples; (2) an ensemble of generators for steadily producing diverse synthetic data; and (3) a selective filtering process for querying the victim model with harder, more balanced samples. In addition, we introduce, for the first time, a more realistic setting in which the attacker has no knowledge of the number of classes in the victim training data, and we propose a solution that learns the number of classes on the fly. Our evaluation shows that CaBaGe outperforms existing techniques on seven datasets (MNIST, FMNIST, SVHN, CIFAR-10, CIFAR-100, ImageNet-subset, and Tiny ImageNet), improving the accuracy of the extracted models by up to 43.13%. Furthermore, the number of queries required to extract a clone model matching the final accuracy of prior work is reduced by up to 75.7%.
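The abstract does not say how the number of classes is learned on the fly. One plausible reading, sketched here under the assumption that the victim returns a single hard label per query, is that the attacker grows the clone's output layer whenever a previously unseen label comes back. The `GrowingHead` class, its zero-initialised rows, and the plain-list linear layer are hypothetical and chosen only to keep the example dependency-free.

```python
class GrowingHead:
    """Hypothetical linear output layer that adds one row per newly seen victim label."""

    def __init__(self, in_features):
        self.in_features = in_features
        self.label_to_index = {}  # victim label -> clone output index
        self.weights = []         # one row of `in_features` floats per known class

    def observe(self, victim_label):
        # First time a label appears: allocate a new output unit for it.
        if victim_label not in self.label_to_index:
            self.label_to_index[victim_label] = len(self.weights)
            self.weights.append([0.0] * self.in_features)
        return self.label_to_index[victim_label]

    def num_classes(self):
        return len(self.weights)

    def logits(self, features):
        # One logit per class discovered so far.
        return [sum(w * x for w, x in zip(row, features)) for row in self.weights]


head = GrowingHead(in_features=4)
indices = [head.observe(label) for label in [7, 3, 7, 9]]
print(indices, head.num_classes())
```

In this sketch the clone starts with zero output units and ends with one per distinct label observed, so the attacker never needs the class count up front. Whether CaBaGe uses this mechanism or a different one is not specified in this section.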

Paper Structure (30 sections, 4 equations, 6 figures, 8 tables, 1 algorithm)

Figures (6)

  • Figure 1: Overview of CaBaGE. Our three novel components are colored in yellow. CB-DW Replay is the Class-Balanced Difficulty-Weighted Replay.
  • Figure 2: Fidelity extraction curves for CIFAR-10 and CIFAR-100
  • Figure 3: Accuracy extraction curves for CIFAR-10 and CIFAR-100
  • Figure 4: Performance comparison of DisGUIDE's replay and CaBaGE, with different replay iterations used
  • Figure 5: Effect of varying the ensemble size on final accuracy for ResNet34 extraction on CIFAR-100 under the relaxed-budget setting. The method tested is DisGUIDE with only a generator ensemble added.
  • ...and 1 more figures