Are You Copying My Prompt? Protecting the Copyright of Vision Prompt for VPaaS via Watermark

Huali Ren, Anli Yan, Chong-zhi Gao, Hongyang Yan, Zhenxin Zhang, Jin Li

TL;DR

The paper addresses unauthorized copying of visual prompts in VPaaS by proposing WVPrompt, a black-box watermarking framework. WVPrompt embeds a watermark into a prompt with a poison-only backdoor attack and then verifies ownership by applying a hypothesis test to the predictions returned by a suspect API, so piracy can be detected without access to the model. In experiments on CIFAR-10, EuroSAT, and SVHN with the RN50, BiT-M, and Instagram backbones, the authors report high watermark effectiveness, minimal harm to downstream task performance, and resilience to post-processing such as fine-tuning and pruning. The work offers a practical way to protect VPaaS intellectual property and to check prompt provenance in deployed services.

Abstract

Visual Prompt Learning (VPL) differs from traditional fine-tuning in that it avoids updating the parameters of the pre-trained model, which substantially reduces resource consumption. Instead, it learns an input perturbation, a visual prompt, that is added to downstream task data before prediction. Because learning generalizable prompts requires expert design and a technically demanding, time-consuming optimization process, developers of Visual Prompts as a Service (VPaaS) have emerged. These developers profit by providing well-crafted prompts to authorized customers. However, prompts can be easily copied and redistributed, which threatens the intellectual property of VPaaS developers, so technology to protect their rights is urgently needed. To this end, we present WVPrompt, a method that watermarks visual prompts in a black-box manner. WVPrompt consists of two parts: prompt watermarking and prompt verification. Specifically, it uses a poison-only backdoor attack to embed a watermark into the prompt and then employs hypothesis testing to verify prompt ownership remotely. Extensive experiments were conducted on three well-known benchmark datasets using three popular pre-trained models: RN50, BiT-M, and Instagram. The results demonstrate that WVPrompt is efficient, harmless, and robust to various adversarial operations.
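
To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes images are small grayscale grids with values in [0, 1], that the visual prompt is an additive perturbation of the same shape, and that the watermark trigger is a bright corner patch. The function names, the patch trigger, the target label, and the poisoning rate are all hypothetical choices made for illustration. "Poison-only" here means that only the training data is modified, while the prompt-learning procedure itself is left unchanged.

```python
import random


def apply_prompt(image, prompt):
    """Add the prompt (an input perturbation) to an image and clip to [0, 1]."""
    return [
        [min(1.0, max(0.0, pixel + delta)) for pixel, delta in zip(row, prompt_row)]
        for row, prompt_row in zip(image, prompt)
    ]


def stamp_trigger(image, size=2, value=1.0):
    """Return a copy of the image with a size x size patch set in the bottom-right corner.

    The corner patch is an assumed trigger pattern, chosen only for illustration.
    """
    stamped = [row[:] for row in image]
    height, width = len(stamped), len(stamped[0])
    for i in range(height - size, height):
        for j in range(width - size, width):
            stamped[i][j] = value
    return stamped


def poison_dataset(samples, target_label, rate, seed=0):
    """Build a poisoned training set from (image, label) pairs.

    A fraction `rate` of the samples receives the trigger and is relabeled to
    `target_label`; the remaining samples are passed through unchanged.
    Returns the new sample list and the set of poisoned indices.
    """
    rng = random.Random(seed)
    n_poison = int(round(rate * len(samples)))
    chosen = set(rng.sample(range(len(samples)), n_poison))
    poisoned = []
    for idx, (image, label) in enumerate(samples):
        if idx in chosen:
            poisoned.append((stamp_trigger(image), target_label))
        else:
            poisoned.append((image, label))
    return poisoned, chosen
```

A prompt learned on the output of `poison_dataset` would, under these assumptions, tend to steer triggered inputs toward `target_label` while behaving normally on clean inputs, which is the behavior the watermark relies on.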

Paper Structure

This paper contains 33 sections, 16 equations, 14 figures, and 10 tables.

Figures (14)

  • Figure 1: The workflow of visual prompt learning
  • Figure 2: Illustration of Vision Prompt as a Service (VPaaS) pipeline
  • Figure 3: The pipeline of WVPrompt. In the first step, the defender uses a poison-only backdoor attack to embed the watermark into the prompt. In the second step, the defender verifies prompt ownership by using hypothesis testing to check whether the suspected prompt contains the specific hidden backdoor.
  • Figure 4: Downstream accuracy of pre-trained large models with clean and watermarked visual prompts. Here, RLMVP_C and RLMVP_W denote clean and watermarked visual prompts, respectively.
  • Figure 5: Downstream task accuracy and watermark success rate under different numbers of fine-tuning epochs
  • ...and 9 more figures
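
The verification step described in the Figure 3 caption can be sketched as a black-box statistical test. The example below is an illustration under stated assumptions rather than the paper's procedure: it swaps in a one-sided exact binomial test, and the paper's actual test statistic may differ. The names `query_api`, `p0` (the rate at which an unwatermarked prompt would output the target label on triggered inputs), and `alpha` (the significance level) are hypothetical.

```python
from math import comb


def binomial_p_value(hits, n, p0):
    """One-sided exact p-value: P(X >= hits) for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(hits, n + 1))


def verify_ownership(query_api, triggered_inputs, target_label, p0, alpha=0.05):
    """Test whether a suspect service responds to the watermark trigger.

    Null hypothesis: the service predicts `target_label` on triggered inputs
    with probability at most `p0`, as an independent prompt would.
    Only predicted labels from `query_api` are used, so no model access is needed.
    Returns (reject_null, p_value); rejecting the null suggests the watermark is present.
    """
    predictions = [query_api(x) for x in triggered_inputs]
    hits = sum(1 for label in predictions if label == target_label)
    p_value = binomial_p_value(hits, len(predictions), p0)
    return p_value < alpha, p_value
```

For a 10-class task, `p0 = 0.1` would be a natural assumed baseline; a service that returns the target label on nearly every triggered query then yields a very small p-value, while a service that answers at chance level does not.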