False Claims against Model Ownership Resolution

Jian Liu, Rui Zhang, Sebastian Szyller, Kui Ren, N. Asokan

TL;DR

This work reveals a critical robustness gap in model ownership resolution (MOR) schemes: malicious accusers can forge false ownership claims against independent models by exploiting transferable adversarial examples. The authors formalize a generalized MOR framework with timestamped commitments and trigger-set verification, survey 16 MOR schemes through this lens, and demonstrate successful false-claim attacks on real-world models (including Amazon Rekognition) under realistic settings. They also characterize thresholds and discuss countermeasures, including verifiable trigger generation, defender-trained independent models, and defenses against transferability, while noting that timestamping and practical costs are central considerations. The study highlights the need for robust, scalable defenses if MOR is to remain reliable in legal and commercial contexts.

Abstract

Deep neural network (DNN) models are valuable intellectual property of model owners, constituting a competitive advantage. Therefore, it is crucial to develop techniques to protect against model theft. Model ownership resolution (MOR) is a class of techniques that can deter model theft. A MOR scheme enables an accuser to assert an ownership claim for a suspect model by presenting evidence, such as a watermark or fingerprint, to show that the suspect model was stolen or derived from a source model owned by the accuser. Most of the existing MOR schemes prioritize robustness against malicious suspects, ensuring that the accuser will win if the suspect model is indeed a stolen model. In this paper, we show that common MOR schemes in the literature are vulnerable to a different, equally important but insufficiently explored, robustness concern: a malicious accuser. We show how malicious accusers can successfully make false claims against independent suspect models that were not stolen. Our core idea is that a malicious accuser can deviate (without detection) from the specified MOR process by finding (transferable) adversarial examples that successfully serve as evidence against independent suspect models. To this end, we first generalize the procedures of common MOR schemes and show that, under this generalization, defending against false claims is as challenging as preventing (transferable) adversarial examples. Via systematic empirical evaluation, we show that our false claim attacks always succeed in the MOR schemes that follow our generalization, including in a real-world model: Amazon's Rekognition API.
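
The generalized procedure described above (a timestamped commitment to the evidence, followed by a check of the suspect model on a trigger set) can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the paper's implementation: the function names `commit`, `mor_acc`, and `verify_claim`, the SHA-256 commitment, and the reading of MORacc as the fraction of trigger inputs on which the suspect model returns the claimed label are all assumptions made for this example.

```python
import hashlib
from typing import Callable, Sequence, Tuple

# A trigger is an (input, claimed_label) pair. Inputs are plain floats here
# purely for illustration; real schemes use images or other model inputs.
Trigger = Tuple[float, int]


def commit(trigger_set: Sequence[Trigger], timestamp: str) -> str:
    """Hypothetical commitment: hash the trigger set together with a timestamp."""
    payload = repr((list(trigger_set), timestamp)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mor_acc(model: Callable[[float], int], trigger_set: Sequence[Trigger]) -> float:
    """Assumed metric: fraction of triggers where the suspect matches the claimed label."""
    hits = sum(1 for x, label in trigger_set if model(x) == label)
    return hits / len(trigger_set)


def verify_claim(
    model: Callable[[float], int],
    trigger_set: Sequence[Trigger],
    timestamp: str,
    commitment: str,
    threshold: float,
) -> bool:
    """Accept the claim only if the commitment opens correctly and MORacc meets the threshold."""
    if commit(trigger_set, timestamp) != commitment:
        return False
    return mor_acc(model, trigger_set) >= threshold


# Toy suspect model and trigger set (hypothetical values).
suspect = lambda x: int(x > 0.5)
triggers = [(0.9, 1), (0.2, 0), (0.7, 0), (0.6, 1)]
c = commit(triggers, "2023-01-01T00:00:00Z")
accepted = verify_claim(suspect, triggers, "2023-01-01T00:00:00Z", c, threshold=0.5)
```

Note that nothing in this check constrains how the trigger set was produced. An accuser who fills it with inputs that also fool an independent model passes the same verification, which is the gap the abstract describes.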

Paper Structure (37 sections, 7 equations, 2 figures, 6 tables)

Figures (2)

  • Figure 1: Comparison between original images and the noised versions (the per-pixel perturbation bound is 16; the original images are on the left-hand side and the noised images are on the right-hand side).
  • Figure 2: $\mathit{MORacc}$ with different per-pixel perturbation bounds.
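
To make the "per-pixel perturbation bound" in the captions above concrete, here is a minimal sketch of how such a bound is commonly enforced. It assumes the bound is a maximum absolute change per pixel on a 0 to 255 intensity scale; the captions do not state the scale, and the helper name `clip_perturbation` is hypothetical.

```python
from typing import List, Sequence


def clip_perturbation(
    original: Sequence[int], perturbed: Sequence[int], bound: int = 16
) -> List[int]:
    """Limit each pixel's change to +/- bound, then keep the result in [0, 255]."""
    clipped = []
    for o, p in zip(original, perturbed):
        # Restrict the per-pixel difference to the allowed range (assumed bound).
        delta = max(-bound, min(bound, p - o))
        # Keep the resulting pixel a valid 8-bit intensity.
        clipped.append(max(0, min(255, o + delta)))
    return clipped


# Hypothetical pixel values: the first change (40) exceeds the bound and is cut to 16.
noised = clip_perturbation([100, 250, 5], [140, 255, 0], bound=16)
```

With a bound of 16, every pixel of the noised image stays within 16 intensity levels of the original under this assumption.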

Theorems & Definitions (1)

  • Definition 1