Efficient and Effective Model Extraction
Hongyu Zhu, Wentao Hu, Sichu Liang, Fangqi Li, Wenwen Wang, Shilin Wang
TL;DR
This work addresses the inefficiency of model extraction under data-free and data-dependent settings by revisiting fundamental design choices. It introduces E3, a combination of VarRes, temperature scaling, language-guided queries, and TTDA to dramatically improve generalization at low cost. Experiments on CIFAR-10/100 show E3 achieves superior accuracy with a fraction of the query budget and runtime compared to SOTA baselines, with ablations validating each component. The results highlight that model extraction remains a serious threat and offer a benchmark-friendly approach for security evaluations.
Abstract
Model extraction aims to create a functionally similar copy of a model served through a machine-learning-as-a-service (MLaaS) API with minimal overhead, typically for illicit profit or as a precursor to further attacks, posing a significant threat to the MLaaS ecosystem. However, recent studies have shown that model extraction is highly inefficient, particularly when the target task distribution is unavailable. In such cases, even substantially increasing the attack budget fails to produce a sufficiently similar replica, reducing the adversary's motivation to pursue extraction attacks. In this paper, we revisit the elementary design choices throughout the extraction lifecycle. We propose an embarrassingly simple yet dramatically effective algorithm, Efficient and Effective Model Extraction (E3), focusing on both query preparation and the training routine. E3 achieves superior generalization compared to state-of-the-art methods while minimizing computational costs. For instance, with only 0.005 times the query budget and less than 0.2 times the runtime, E3 outperforms classical generative-model-based data-free model extraction by more than 50% in absolute accuracy on CIFAR-10. Our findings underscore the persistent threat posed by model extraction and suggest that E3 could serve as a valuable benchmarking algorithm for future security evaluations.
