Efficient Data-Free Model Stealing with Label Diversity

Yiyong Liu, Rui Wen, Michael Backes, Yang Zhang

TL;DR

The paper reframes data-free model stealing around sample diversity. It proposes DB-DFMS, which trains a generator to produce inputs that are diverse across all classes and a clone model that imitates a victim under black-box access. The generator is trained with a diversity loss based on the entropy of the clone’s predictions, while the clone is trained with an L1-based objective against logits approximated from the victim's output probabilities, giving an efficient two-player training loop. Empirical results on CIFAR-10, SVHN, and CelebA show that DB-DFMS matches or surpasses state-of-the-art data-free methods with lower query budgets and less computation, and that it remains effective across clone-architecture variations and unbalanced data. The work argues that maximizing diversity is a key, practical lever for model stealing, with implications for both attackers and defenders, and it suggests avenues for further improving data-free strategies and defenses.
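The two objectives named in the summary can be illustrated with a small sketch. This is not the authors' implementation: the summary does not give the exact formulas, so the code assumes that the diversity loss is the negative entropy of the batch-averaged clone prediction, and that victim logits are approximated by mean-centered log-probabilities. The function names (`diversity_loss`, `approx_logits`, `l1_clone_loss`) are hypothetical, and the code uses plain Python lists in place of a deep learning framework.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def diversity_loss(clone_logits_batch, eps=1e-12):
    """Negative entropy of the batch-averaged clone prediction.

    Assumed form, not taken from the paper. A generator minimizing this
    value is pushed toward batches whose predicted labels cover all classes.
    """
    probs = [softmax(row) for row in clone_logits_batch]
    n, k = len(probs), len(probs[0])
    mean_p = [sum(p[j] for p in probs) / n for j in range(k)]
    return sum(a * math.log(a + eps) for a in mean_p)


def approx_logits(victim_probs, eps=1e-12):
    """Approximate logits from probabilities (assumed form).

    Log-probabilities equal the true logits up to an additive constant,
    so subtracting their mean yields mean-centered logits.
    """
    log_p = [math.log(p + eps) for p in victim_probs]
    mean = sum(log_p) / len(log_p)
    return [lp - mean for lp in log_p]


def l1_clone_loss(clone_logits_batch, victim_probs_batch):
    """Mean over the batch of the L1 distance between clone logits
    and the approximated victim logits."""
    total = 0.0
    for clone_logits, victim_probs in zip(clone_logits_batch, victim_probs_batch):
        target = approx_logits(victim_probs)
        total += sum(abs(c - t) for c, t in zip(clone_logits, target))
    return total / len(clone_logits_batch)


if __name__ == "__main__":
    balanced = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
    collapsed = [[10.0, 0.0, 0.0]] * 3
    # The balanced batch should score lower (better) than the collapsed one.
    print(diversity_loss(balanced), diversity_loss(collapsed))
```

Under these assumptions, a batch whose predicted labels are spread evenly over the classes scores lower than a batch collapsed onto one class, which is the behavior a diversity objective is meant to reward. In a real attack both losses would be computed on tensors and backpropagated through the generator and the clone, respectively.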

Abstract

Machine Learning as a Service (MLaaS) allows users to query a machine learning model through an API, giving them access to a high-performance model trained on valuable data. This interface boosts the proliferation of machine learning based applications, but it also introduces an attack surface for model stealing attacks. Existing model stealing attacks have relaxed their assumptions to the data-free setting while remaining effective. However, these methods are complex and consist of several components, which obscures the core on which the attack really depends. In this paper, we revisit the model stealing problem from a diversity perspective and demonstrate that keeping the generated data samples diverse across all classes is the critical factor for improving attack performance. Based on this conjecture, we provide a simplified attack framework. We empirically validate our conjecture by evaluating the effectiveness of our attack, and experimental results show that our approach achieves comparable or even better performance than the state-of-the-art method. Furthermore, because it has no redundant components, our method has advantages in attack efficiency and query budget.

Paper Structure (24 sections, 4 equations, 7 figures, 12 tables)

Figures (7)

  • Figure 1: Workflow of DB-DFMS.
  • Figure 2: Entropy of generated data samples according to the predictions of the victim model. The victim model is ResNet-34-8x trained on CIFAR-10 and the clone model is ResNet-18-8x.
  • Figure 3: t-SNE representations of the embeddings of 512 randomly generated data samples for different data-free model stealing methods. The victim model is ResNet-34-8x trained on CIFAR-10 and the clone model is ResNet-18-8x.
  • Figure 4: Generated data samples and Grad-CAM visualizations from Random Noise, DFME, DFMS-SL, and DB-DFMS (from top to bottom) for models trained on SVHN.
  • Figure 5: Generated data samples and Grad-CAM visualizations from Random Noise, DFME, DFMS-SL, and DB-DFMS (from top to bottom) for models trained on CelebA.
  • ...and 2 more figures