FPEdit: Robust LLM Fingerprinting through Localized Parameter Editing
Shida Wang, Chaohu Liu, Yubo Wang, Linli Xu
TL;DR
The paper tackles IP protection for large language models by addressing the fragility and detectability of existing fingerprinting methods. It introduces FPEdit, a knowledge-editing framework that embeds natural language fingerprints through Promote-Suppress Value Vector Optimization, enabling sparse, robust edits that resist downstream adaptation while preserving model utility. Empirical results show near-perfect fingerprint retention under both full-parameter and LoRA-based fine-tuning (e.g., a 98–100% post-tuning fingerprint success rate, FSR), strong resistance to quantization, pruning, and model merging, and high efficiency (embedding 10 fingerprint pairs into LLaMA2-7B in under 2 minutes on a single A100 with less than 30 GB of GPU memory). These findings suggest a practical, scalable approach to verifiable model provenance in adversarial deployment contexts, balancing legitimate IP protection with open-source collaboration.
Abstract
Large language models represent significant investments in computation, data, and engineering expertise, making them extraordinarily valuable intellectual assets. Nevertheless, these AI assets remain vulnerable to unauthorized redistribution and commercial exploitation through fine-tuning or black-box deployment. Current fingerprinting approaches face a fundamental trade-off: intrinsic methods require full parameter access, while backdoor-based techniques employ statistically anomalous triggers that adversaries can easily detect and filter. To address these limitations, we introduce FPEdit, a novel framework that leverages knowledge editing to inject semantically coherent natural language fingerprints through sparse, targeted modifications to model weights. Our approach introduces Promote-Suppress Value Vector Optimization, which increases the likelihood of the target token while suppressing competing tokens, ensuring robust fingerprint integration without degrading core model functionality. Extensive experiments show that FPEdit achieves 95–100% fingerprint retention under both full-parameter fine-tuning and parameter-efficient adaptation, while preserving performance on downstream benchmarks. Moreover, FPEdit remains robust under quantization, pruning, and stochastic decoding, and can embed 10 fingerprint pairs into LLaMA2-7B in under 2 minutes using less than 30 GB of GPU memory, which represents a substantial reduction in resource requirements. These advances establish FPEdit as the first fingerprinting approach to simultaneously achieve robustness against adaptation, resistance to detection, and preservation of model utility, thereby providing a minimally invasive solution for reliable provenance verification of large language models in adversarial deployment scenarios.
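To make the promote-suppress idea concrete, the following is a minimal sketch of an objective that rewards the target token and penalizes competing tokens at a single decoding position. It is an illustration under stated assumptions, not the formulation used by FPEdit: the function names, the `suppress_weight` parameter, and the specific form of the suppression term (the negative log of one minus each competitor's probability) are all hypothetical, and the sketch operates on a plain list of logits rather than on value vectors inside a model.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def promote_suppress_loss(logits, target_id, competitor_ids, suppress_weight=1.0):
    """Hypothetical promote-suppress objective for one token position.

    promote:  negative log-likelihood of the target token.
    suppress: sum of -log(1 - p_c) over the competing tokens, which grows
              as any competitor's probability p_c approaches 1.
    This is an assumed form chosen for illustration only.
    """
    log_probs = log_softmax(logits)
    promote = -log_probs[target_id]
    suppress = 0.0
    for c in competitor_ids:
        # Clamp below 1 so the logarithm stays finite.
        p = min(math.exp(log_probs[c]), 1.0 - 1e-12)
        suppress += -math.log1p(-p)
    return promote + suppress_weight * suppress


if __name__ == "__main__":
    # Token 0 is the fingerprint target; token 1 is a competing answer.
    favors_target = [5.0, 0.0, 0.0]
    favors_competitor = [0.0, 5.0, 0.0]
    print(promote_suppress_loss(favors_target, 0, [1]))
    print(promote_suppress_loss(favors_competitor, 0, [1]))
```

Logits that favor the target yield a lower loss than logits that favor a competitor, so minimizing this quantity pushes probability mass toward the fingerprint response and away from the alternatives. Setting `suppress_weight=0.0` reduces the sketch to ordinary negative log-likelihood on the target token.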
