FPEdit: Robust LLM Fingerprinting through Localized Parameter Editing

Shida Wang, Chaohu Liu, Yubo Wang, Linli Xu

TL;DR

The paper addresses intellectual-property protection for large language models, where existing fingerprinting methods are either fragile under adaptation or easy to detect. It introduces FPEdit, a knowledge-editing framework that embeds natural language fingerprints via Promote-Suppress Value Vector Optimization, producing sparse edits that survive downstream adaptation while preserving model utility. Reported results show near-perfect fingerprint retention under full-parameter and LoRA-based fine-tuning (e.g., 98–100% post-tuning fingerprint success rate, FSR), strong resistance to quantization, pruning, and model merging, and high efficiency (embedding 10 fingerprint pairs into LLaMA2-7B in under 2 minutes on a single A100 with less than 30 GB of GPU memory). These findings suggest a practical, scalable approach to verifiable model provenance in adversarial deployment contexts, balancing legitimate IP protection with open-source collaboration.

Abstract

Large language models represent significant investments in computation, data, and engineering expertise, making them extraordinarily valuable intellectual assets. Nevertheless, these AI assets remain vulnerable to unauthorized redistribution and commercial exploitation through fine-tuning or black-box deployment. Current fingerprinting approaches face a fundamental trade-off: intrinsic methods require full parameter access, while backdoor-based techniques employ statistically anomalous triggers easily detected and filtered by adversaries. To address these limitations, we introduce FPEdit, a novel framework that leverages knowledge editing to inject semantically coherent natural language fingerprints through sparse, targeted modifications to model weights. Our approach introduces Promote-Suppress Value Vector Optimization, which simultaneously enhances target token likelihood while suppressing competing tokens, ensuring robust fingerprint integration without degrading core model functionality. Extensive experiments show that FPEdit achieves 95-100% fingerprint retention under both full-parameter fine-tuning and parameter-efficient adaptation, while preserving performance on downstream benchmarks. Moreover, FPEdit remains robust under quantization, pruning, and stochastic decoding, and can embed 10 fingerprint pairs into LLaMA2-7B in under 2 minutes using less than 30 GB of GPU memory, which represents a substantial reduction in resource requirements. These advances establish FPEdit as the first fingerprinting approach to simultaneously achieve robustness against adaptation, resistance to detection, and preservation of model utility, thereby providing a minimally invasive solution for reliable provenance verification of large language models in adversarial deployment scenarios.
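
The abstract describes Promote-Suppress Value Vector Optimization only at a high level: raise the likelihood of the target token while suppressing competing tokens. The sketch below illustrates one plausible reading of that idea as a scalar loss over a single next-token distribution. It is not the paper's objective, which this section does not state. The function names, the top-`k` choice of competitors, and `suppress_weight` are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def promote_suppress_loss(logits, target_id, k=3, suppress_weight=0.5):
    """Hypothetical promote-suppress loss for one next-token prediction.

    promote:  negative log-likelihood of the target token.
    suppress: probability mass on the k most likely non-target tokens.
    Both terms are assumptions; the paper may define them differently.
    """
    logp = log_softmax(logits)
    promote = -logp[target_id]

    # Rank every non-target token by its log-probability, highest first.
    competitors = sorted(
        (i for i in range(len(logits)) if i != target_id),
        key=lambda i: logp[i],
        reverse=True,
    )[:k]
    suppress = sum(math.exp(logp[i]) for i in competitors)

    return promote + suppress_weight * suppress


if __name__ == "__main__":
    # Toy vocabulary of four tokens; token 0 is the fingerprint target.
    print(promote_suppress_loss([5.0, 0.0, 0.0, 0.0], target_id=0))
    print(promote_suppress_loss([0.0, 5.0, 0.0, 0.0], target_id=0))
```

In an actual editing method, a loss of this kind would be minimized with respect to a value vector at a chosen layer rather than evaluated on fixed logits. The sketch only shows how the two terms pull in the same direction: the loss is smaller when the target token dominates the distribution.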

Paper Structure

This paper contains 37 sections, 13 equations, 10 figures, 15 tables.

Figures (10)

  • Figure 1: (a) Sophisticated infringers circumvent licensing terms through techniques such as fine-tuning or black-box deployment. (b) We compare perplexity distributions for natural language fingerprint (NLF) triggers, garbled fingerprint (GF) triggers, and normal user inputs (Alpaca-GPT4; Peng et al., 2023). (c) NLF triggers bypass anomalous input filters owing to their distributional similarity to normal inputs, enabling verification reliability where adversarial GF triggers are rejected.
  • Figure 2: The overview of FPEdit for copyright tracking. (a) Fingerprinting and verification process using Natural Language Fingerprints. (b) Fingerprint embedding via knowledge editing with Promote-Suppress Value Vector Optimization.
  • Figure 4: Examples of fingerprint pairs employed by different fingerprinting methods.
  • ...and 7 more figures
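
The Figure 1 caption argues that natural language fingerprint triggers pass anomalous-input filters because their perplexity resembles that of normal inputs, while garbled triggers are rejected. A minimal sketch of such a filter follows, assuming per-token log-probabilities are already available from some scoring model. The threshold value and the function names are hypothetical and do not come from the paper.

```python
import math


def perplexity(token_logprobs):
    """Perplexity from natural-log token probabilities: exp(-mean log p)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def passes_filter(token_logprobs, max_perplexity=100.0):
    """Hypothetical anomaly filter: accept inputs at or below a perplexity cap."""
    return perplexity(token_logprobs) <= max_perplexity


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the paper.
    natural_like = [-1.0, -1.2, -0.8, -1.1]   # fluent text: moderate log-probs
    garbled_like = [-9.0, -8.5, -9.3, -8.8]   # random tokens: very low log-probs
    print(passes_filter(natural_like), passes_filter(garbled_like))
```

Under this kind of filter, a trigger is only useful for verification if it lands on the accepted side of the threshold, which is the property the caption attributes to NLF triggers.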