THEMIS: Towards Practical Intellectual Property Protection for Post-Deployment On-Device Deep Learning Models

Yujin Huang, Zhi Zhang, Qingchuan Zhao, Xingliang Yuan, Chunyang Chen

TL;DR

The paper tackles intellectual property protection for post-deployment on-device deep learning models, which are stored on user devices in a read-only, inference-only form. It introduces Themis, an automatic four-stage toolchain (model extraction, rooting, reweighting, and app reassembly) that reconstructs writable equivalents of on-device models and embeds watermarks via a training-free, feed-forward approach (FFKEW). Experiments across four datasets and multiple architectures show high watermark success rates (often >80%) with modest utility loss. Practical feasibility is demonstrated by watermarking 327 of 403 real-world Android apps (81.14%). The paper also evaluates robustness against model-extraction attacks and reports end-to-end results on a large-scale collection of real-world apps.

Abstract

On-device deep learning (DL) has rapidly gained adoption in mobile apps, offering the benefits of offline model inference and user privacy preservation over cloud-based approaches. However, it inevitably stores models on user devices, introducing new vulnerabilities, particularly model-stealing attacks and intellectual property infringement. While system-level protections like Trusted Execution Environments (TEEs) provide a robust solution, practical challenges remain in achieving scalable on-device DL model protection, including complexities in supporting third-party models and limited adoption in current mobile solutions. Advancements in TEE-enabled hardware, such as NVIDIA's GPU-based TEEs, may address these obstacles in the future. Currently, watermarking serves as a common defense against model theft but also faces challenges here as many mobile app developers lack corresponding machine learning expertise and the inherent read-only and inference-only nature of on-device DL models prevents third parties like app stores from implementing existing watermarking techniques in post-deployment models. To protect the intellectual property of on-device DL models, in this paper, we propose THEMIS, an automatic tool that lifts the read-only restriction of on-device DL models by reconstructing their writable counterparts and leverages the untrainable nature of on-device DL models to solve watermark parameters and protect the model owner's intellectual property. Extensive experimental results across various datasets and model structures show the superiority of THEMIS in terms of different metrics. Further, an empirical investigation of 403 real-world DL mobile apps from Google Play is performed with a success rate of 81.14%, showing the practicality of THEMIS.
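As a quick consistency check, the 81.14% success rate quoted in the abstract can be reproduced from the app counts given in the TL;DR (327 watermarked apps out of 403). The variable names below are illustrative only and do not come from the paper.

```python
# Counts taken from the TL;DR (327 of 403 apps watermarked).
# Variable names are hypothetical, chosen for this sketch only.
watermarked_apps = 327
total_apps = 403

# Success rate as a percentage of all evaluated apps.
success_rate = watermarked_apps / total_apps * 100

print(f"{success_rate:.2f}%")  # prints "81.14%"
```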

Paper Structure

This paper contains 27 sections, 3 equations, 8 figures, 9 tables, 1 algorithm.

Figures (8)

  • Figure 1: The common scenario of model stealing. The red block indicates that Themis enables on-device DL model protection like watermarking.
  • Figure 2: The Overview of Themis.
  • Figure 3: The workflow of Model Rooting.
  • Figure 4: An example of model informative classes generated from the TFLite schema. Yellow and cyan highlight the data structures (tables and their exemplary fields) of the TFLite model schema. Python keywords, class and method names are emphasized in blue, dark and pink for clarity.
  • Figure 5: An example of TFLite utility classes.
  • ...and 3 more figures