SAME: Sample Reconstruction against Model Extraction Attacks
Yi Xie, Jie Zhang, Shiqian Zhao, Tianwei Zhang, Xiaofeng Chen
TL;DR
SAME addresses the privacy and IP risks of MLaaS by detecting model extraction queries through sample reconstruction. It uses a Masked Auto-encoder to reconstruct query samples and an auxiliary model that repairs predictions on the reconstructed data, yielding a robust anomaly score that combines reconstruction loss and deviation from the auxiliary model. It requires no auxiliary OOD datasets, user query history, or white-box access, and can function as a plug-in add-on to existing defenses. Across the MNIST/EMNIST and CIFAR datasets and three model extraction attack (MEA) strategies, SAME outperforms state-of-the-art defenses in AUROC, AUPR, and FPR95 while preserving victim-model fidelity and reducing defense overhead. The work provides extensive ablation analyses to justify each component and demonstrates practical versatility through integration with reject-prediction or proof-of-work schemes, offering a scalable defense for real-world MLaaS deployments.
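The idea of an anomaly score that combines a reconstruction loss with a deviation term can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mean-squared-error reconstruction loss, the L1 gap between prediction vectors, and the `weight` parameter are all assumptions made for illustration. The summary above does not give the exact formula or say which inputs each model sees, so the sketch takes precomputed vectors from the caller.

```python
# Hypothetical sketch of a combined anomaly score (not the paper's exact formula).
# Inputs are plain lists of floats: a query sample, its reconstruction, and two
# prediction vectors (one from the victim model, one from the auxiliary model).

def reconstruction_loss(x, x_rec):
    """Mean squared error between a sample and its reconstruction (assumed metric)."""
    if len(x) != len(x_rec):
        raise ValueError("sample and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_rec)) / len(x)


def prediction_deviation(p_victim, p_aux):
    """L1 distance between two prediction vectors (assumed deviation measure)."""
    if len(p_victim) != len(p_aux):
        raise ValueError("prediction vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(p_victim, p_aux))


def anomaly_score(x, x_rec, p_victim, p_aux, weight=1.0):
    """Weighted sum of both terms; `weight` is a hypothetical trade-off parameter."""
    return reconstruction_loss(x, x_rec) + weight * prediction_deviation(p_victim, p_aux)


# A perfectly reconstructed query with matching predictions scores zero.
benign = anomaly_score([0.0, 0.5, 1.0], [0.0, 0.5, 1.0], [0.9, 0.1], [0.9, 0.1])

# A poorly reconstructed query with disagreeing predictions scores higher.
suspicious = anomaly_score([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], weight=0.5)
```

Under these assumptions, a defender would flag queries whose score exceeds a threshold chosen on held-out benign data; how SAME sets that threshold is not stated in this summary.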
Abstract
While deep learning models have achieved strong performance across various domains, deploying them requires extensive resources and advanced computing infrastructure. Machine Learning as a Service (MLaaS) has emerged as a solution, lowering the barriers for users to release or productize their deep learning models. However, previous studies have highlighted privacy and security concerns associated with MLaaS, and one primary threat is the model extraction attack. Many defenses have been proposed against this threat, but they suffer from unrealistic assumptions and generalization issues, making them less practical for reliable protection. Driven by these limitations, we introduce a novel defense mechanism, SAME, based on the concept of sample reconstruction. This strategy imposes minimal prerequisites on the defender's capabilities, eliminating the need for auxiliary Out-of-Distribution (OOD) datasets, user query history, white-box model access, and additional intervention during model training. It is also compatible with existing active defense methods. Our extensive experiments corroborate the superior efficacy of SAME over state-of-the-art solutions. Our code is available at https://github.com/xythink/SAME.
