Queries, Representation & Detection: The Next 100 Model Fingerprinting Schemes

Augustin Godinot, Erwan Le Merrer, Camilla Penzo, François Taïani, Gilles Trédan

TL;DR

The paper addresses the risk of IP theft via model stealing in production systems and the lack of principled benchmarks to compare fingerprinting methods across varied data and access assumptions. It introduces the Anna Karenina Heuristic (AKH) baseline and the QuRD decomposition (Query, Representation, Detection) to systematically explore fingerprinting schemes. Empirical results show AKH often matches or surpasses more complex fingerprints, while the QuRD framework reveals why certain sampling strategies, particularly adversarial approaches, can be brittle and benchmark-dependent. The work also provides a benchmarking critique, a set of evaluation metrics, and an open-source toolbox to guide the development of more informative fingerprints and benchmarks.

Abstract

The deployment of machine learning models in operational contexts represents a significant investment for any organisation. Consequently, the risk of these models being misappropriated by competitors needs to be addressed. In recent years, numerous proposals have been put forth to detect instances of model stealing. However, these proposals operate under implicit and disparate data and model access assumptions; as a consequence, it remains unclear how they can be effectively compared to one another. Our evaluation shows that a simple baseline that we introduce performs on par with existing state-of-the-art fingerprints, which, on the other hand, are much more complex. To uncover the reasons behind this intriguing result, this paper introduces a systematic approach to both the creation of model fingerprinting schemes and their evaluation benchmarks. By dividing model fingerprinting into three core components -- Query, Representation and Detection (QuRD) -- we are able to identify $\sim100$ previously unexplored QuRD combinations and gain insights into their performance. Finally, we introduce a set of metrics to compare and guide the creation of more representative model stealing detection benchmarks. Our approach reveals the need for more challenging benchmarks and a sound comparison with baselines. To foster the creation of new fingerprinting schemes and benchmarks, we open-source our fingerprinting toolbox.
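The three-part decomposition can be pictured as a small pipeline. The sketch below is only an illustration under stated assumptions, not the authors' toolbox: the function names, the uniform random sampler, the hard-label representation, the disagreement-rate detector, the toy models, and the threshold of 0.1 are all hypothetical choices made for this example.

```python
import numpy as np

def sample_queries(pool, budget, rng):
    """Query: pick `budget` inputs uniformly at random from a pool (hypothetical sampler)."""
    idx = rng.choice(len(pool), size=budget, replace=False)
    return pool[idx]

def represent(model, queries):
    """Representation: the hard labels the model returns on the queries (assumed choice)."""
    return np.asarray([model(x) for x in queries])

def detect(rep_victim, rep_suspect, threshold):
    """Detection: flag the suspect when the fraction of disagreeing labels
    is at most `threshold` (assumed decision rule)."""
    distance = float(np.mean(rep_victim != rep_suspect))
    return distance <= threshold, distance

# Toy models on 2-D inputs, invented for this illustration only.
victim = lambda x: int(x[0] + x[1] > 0)
stolen = lambda x: int(x[0] + x[1] > 0.05)   # slightly perturbed copy of the victim
unrelated = lambda x: int(x[0] - x[1] > 0)   # independently chosen decision rule

rng = np.random.default_rng(0)
pool = rng.normal(size=(1000, 2))
queries = sample_queries(pool, budget=200, rng=rng)

rep_v = represent(victim, queries)
flag_stolen, d_stolen = detect(rep_v, represent(stolen, queries), threshold=0.1)
flag_unrelated, d_unrelated = detect(rep_v, represent(unrelated, queries), threshold=0.1)
print(flag_stolen, d_stolen, flag_unrelated, d_unrelated)
```

Swapping any one of the three functions for another (a different sampler, logits instead of labels, a different decision rule) yields a different scheme, which is the sense in which the decomposition opens up many combinations.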

Paper Structure

This paper contains 33 sections, 1 theorem, 14 equations, 4 figures, 3 tables, 1 algorithm.

Key Result

Proposition 1

Consider two models $h, h' \in \mathcal{Y}^\mathcal{X}$ with accuracies $\alpha = \mathbb{P}\left(h(x) = c(x)\right)$ and $\alpha' = \mathbb{P}\left(h'(x) = c(x)\right)$, respectively. Let $\delta = d_H(h, h')$ be the relative Hamming distance
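The quantities named in the proposition can be estimated from predictions on a finite sample. The sketch below assumes that the relative Hamming distance is the fraction of inputs on which the two models disagree, and that $c$ is the ground-truth labelling; the helper names and the toy label arrays are hypothetical.

```python
import numpy as np

def accuracy(preds, truth):
    """Empirical estimate of P(h(x) = c(x))."""
    return float(np.mean(np.asarray(preds) == np.asarray(truth)))

def relative_hamming(preds_a, preds_b):
    """Fraction of inputs on which two models disagree (assumed definition of d_H)."""
    return float(np.mean(np.asarray(preds_a) != np.asarray(preds_b)))

# Toy labels on 10 inputs, invented for this illustration.
truth    = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1, 0])  # c(x)
h_preds  = np.array([0, 1, 1, 0, 1, 0, 0, 1, 0, 1])  # h(x): wrong on the last two inputs
hp_preds = np.array([0, 1, 0, 0, 1, 0, 1, 1, 1, 0])  # h'(x): wrong on two other inputs

alpha = accuracy(h_preds, truth)
alpha_prime = accuracy(hp_preds, truth)
delta = relative_hamming(h_preds, hp_preds)
print(alpha, alpha_prime, delta)
```

As a general property of the Hamming distance (a triangle-inequality fact, not a restatement of the proposition), such estimates always satisfy $|\alpha - \alpha'| \le \delta \le (1 - \alpha) + (1 - \alpha')$.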

Figures (4)

  • Figure 1: The TPR@$5\%$ of most of the fingerprinting schemes proposed in the literature is at best as good as the simple baseline we introduce. Each colored dot represents the performance of an existing fingerprinting scheme evaluated on a given benchmark. The gray dots are fingerprinting schemes we created using our Query, Representation and Detection (QuRD) decomposition.
  • Figure 2: TPR@$5\%$ gains on the ModelReuse benchmark (Li et al., 2021) obtained by modifying the sampler of existing fingerprints. The sampler can be modified in two ways: drawing seed queries from the train set vs the test set (shown as circles vs crosses) or using a different query sampler (shown as a different color). Selecting negative seed inputs for adversarial generation instead of the original seeds can lead to improvements on the order of $10$ points ($+14\%$).
  • Figure 3: Distribution of the conditioned Hamming distance $d_C(h, h')$ between the models of each positive/negative $(h, h')$ pair.
  • Figure 4: The effect of the query budget $s$ on the Efficiency and Robustness of existing fingerprints, as measured by TPR@$5\%$.
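The figure captions report TPR@$5\%$. One common reading, assumed here, is the true positive rate obtained when the decision threshold is set so that at most $5\%$ of negative (unrelated) pairs are flagged. The sketch below follows that reading; the function name, the convention that a higher score means "more likely stolen", and the toy scores are assumptions made for this example.

```python
import numpy as np

def tpr_at_fpr(pos_scores, neg_scores, max_fpr=0.05):
    """Return (TPR, threshold), flagging a pair when its score is strictly above
    the threshold, with the threshold chosen so that FPR <= max_fpr."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]  # descending order
    # Number of negatives allowed above the threshold (epsilon guards float rounding).
    allowed = int(np.floor(max_fpr * len(neg) + 1e-9))
    # The (allowed + 1)-th largest negative score: at most `allowed` negatives exceed it.
    threshold = neg[allowed] if allowed < len(neg) else -np.inf
    return float(np.mean(pos > threshold)), float(threshold)

# Toy scores, invented for this illustration.
neg_scores = np.arange(100) / 100  # 0.00, 0.01, ..., 0.99
pos_scores = np.array([0.5, 0.9, 0.95, 0.97, 0.99, 1.0, 0.96, 0.3])

tpr, thr = tpr_at_fpr(pos_scores, neg_scores)
fpr = float(np.mean(neg_scores > thr))
print(tpr, thr, fpr)
```

With a score where lower means more similar (for example a distance), the same function applies after negating the scores.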

Theorems & Definitions (2)

  • Proposition 1
  • proof