Fully Exploiting Every Real Sample: SuperPixel Sample Gradient Model Stealing

Yunlong Zhao, Xiaoheng Deng, Yijing Liu, Xinjun Pei, Jiazhi Xia, Wei Chen

TL;DR

Superpixel Sample Gradient stealing (SPSG) is proposed for model stealing under the constraint of limited real samples; it achieves accuracy, agreement, and adversarial success rates that significantly surpass current state-of-the-art MS methods.

Abstract

Model stealing (MS) involves querying and observing the output of a machine learning model in order to steal its capabilities. The quality of the queried data is crucial, yet obtaining a large amount of real data for MS is often challenging. Recent works reduce reliance on real data by using generative models; however, when high-dimensional query data is required, these methods become impractical due to the high cost of querying and the risk of model collapse. In this work, we propose using sample gradients (SG) to enhance the utility of each real sample, as SG provides crucial guidance on the decision boundaries of the victim model. However, utilizing SG in the model-stealing scenario faces two challenges: (1) pixel-level gradient estimation requires an extensive query volume and is susceptible to defenses, and (2) the estimation of sample gradients has significant variance. This paper proposes Superpixel Sample Gradient stealing (SPSG) for model stealing under the constraint of limited real samples. With the basic idea of imitating the victim model's low-variance patch-level gradients instead of pixel-level gradients, SPSG achieves efficient sample-gradient estimation in two steps. First, we apply patch-wise perturbations to query images to estimate the average gradient in different regions of the image. Then, we filter the gradients through a threshold strategy to reduce variance. Extensive experiments demonstrate that, with the same number of real samples, SPSG achieves accuracy, agreement, and adversarial success rates that significantly surpass current state-of-the-art MS methods. Code is available at https://github.com/zyl123456aB/SPSG_attack.
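The abstract's two estimation steps (patch-wise perturbation, then threshold filtering) can be sketched with a black-box finite-difference loop. This is a minimal illustration under assumed names, not the paper's implementation: `query` stands in for the victim model's output, `segments` is a precomputed superpixel label map, and `eps`/`tau` are illustrative perturbation and threshold parameters.

```python
import numpy as np

def estimate_superpixel_gradients(x, query, segments, eps=1e-2, tau=0.5):
    """Estimate one average gradient per superpixel via finite differences
    on the victim's output, then zero out low-magnitude estimates with a
    threshold -- a sketch of the paper's filtering idea, not its exact rule."""
    base = np.asarray(query(x), dtype=float)   # victim output on the clean sample
    labels = np.unique(segments)               # assumes labels are 0..K-1
    grads = np.zeros(len(labels))
    for s in labels:
        mask = (segments == s)
        x_pert = x.copy()
        x_pert[mask] += eps                    # perturb one whole patch at a time:
        diff = np.asarray(query(x_pert), dtype=float) - base
        grads[s] = diff.sum() / eps            # one query per patch, not per pixel
    # threshold filtering: discard estimates below tau * the largest magnitude
    keep = np.abs(grads) >= tau * np.abs(grads).max()
    return np.where(keep, grads, 0.0)
```

Note the query budget: one query per superpixel plus one for the clean sample, rather than one per pixel, which is the efficiency argument the abstract makes.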

Paper Structure

This paper contains 31 sections, 10 equations, 12 figures, and 13 tables.

Figures (12)

  • Figure 1: The columns from left to right are Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, XGrad-CAM, Layer-CAM, and SG-map. The neural network is a ResNet34 pre-trained on ILSVRC-2012.
  • Figure 2: Four steps of SPSG. The first step is to obtain superpixel gradients and query results through SGPQ. The second step involves acquiring pixel gradients and output logits of the proxy model through backpropagation on the input sample. The third step is to obtain purified superpixel gradients and simulated superpixel gradients of the proxy model using SGP. The fourth step involves updating the proxy model based on the loss function. The gray arrow represents the direction from input to output.
  • Figure 3: Baselines in CUB200
  • Figure 4: Baselines in Diabetic5
  • Figure 5: Ablation study
  • ...and 7 more figures
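The four steps in the Figure 2 caption (SGPQ querying, proxy backpropagation, SGP purification, and the loss-based update) can be sketched end-to-end with a toy scalar-output linear proxy, where pixel gradients are available in closed form. Every name below (`pool_to_superpixels`, `spsg_update`, the squared-error losses) is an illustrative assumption for exposition, not the SPSG implementation.

```python
import numpy as np

def pool_to_superpixels(pixel_grad, segments):
    """Step 3 (in part): average the proxy's pixel-level gradients within each
    superpixel so they are comparable to the victim's patch-level estimates."""
    return np.array([pixel_grad[segments == s].mean()
                     for s in np.unique(segments)])

def spsg_update(x, segments, victim_sp_grad, victim_logit, w, lr=1e-3):
    """One proxy update combining logit matching and superpixel-gradient
    matching, for a toy linear proxy f(x) = sum(w * x)."""
    proxy_logit = float((w * x).sum())   # step 2: proxy forward pass
    pixel_grad = w                       # step 2: df/dx is just w for linear f
    proxy_sp_grad = pool_to_superpixels(pixel_grad, segments)  # step 3
    # step 4: gradient of (logit MSE + superpixel-gradient MSE) w.r.t. w
    grad_w = 2.0 * (proxy_logit - victim_logit) * x
    for s in np.unique(segments):
        mask = (segments == s)
        grad_w[mask] += 2.0 * (proxy_sp_grad[s] - victim_sp_grad[s]) / mask.sum()
    return w - lr * grad_w
```

Iterating `spsg_update` drives both residuals toward zero, so the proxy imitates the victim's outputs and its patch-level gradients at once, which is the training signal the figure describes.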