SELF: A Robust Singular Value and Eigenvalue Approach for LLM Fingerprinting
Hanxiu Zhang, Yue Zheng
TL;DR
SELF addresses IP protection for LLMs with weight-based fingerprints that do not depend on model inputs, which makes them resistant to false-claim attacks. It derives transformation-invariant fingerprints from the singular values and eigenvalues of attention weight matrices, and compares fingerprints with SimNet, a similarity network trained with few-shot learning and data augmentation. The method discriminates strongly between related and unrelated models and is robust to quantization, pruning, and fine-tuning, while keeping the fingerprint compact. In practice, SELF offers a scalable and deployable framework for LLM IP forensics with low runtime overhead for repeated verification.
Abstract
The protection of Intellectual Property (IP) in Large Language Models (LLMs) represents a critical challenge in contemporary AI research. While fingerprinting techniques have emerged as a fundamental mechanism for detecting unauthorized model usage, existing methods, whether behavior-based or structural, suffer from vulnerabilities such as false claim attacks or susceptibility to weight manipulations. To overcome these limitations, we propose SELF, a novel intrinsic weight-based fingerprinting scheme that eliminates dependency on input and inherently resists false claims. SELF achieves robust IP protection through two key innovations: 1) unique, scalable and transformation-invariant fingerprint extraction via singular value and eigenvalue decomposition of LLM attention weights, and 2) effective neural network-based fingerprint similarity comparison based on few-shot learning and data augmentation. Experimental results demonstrate that SELF maintains high IP infringement detection accuracy while showing strong robustness against various downstream modifications, including quantization, pruning, and fine-tuning attacks. Our code is available at https://github.com/HanxiuZhang/SELF_v2.
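
The invariance property behind the first innovation can be illustrated with a small NumPy sketch. This is not the authors' implementation: the matrix sizes, the helper names `singular_value_fingerprint` and `eigenvalue_fingerprint`, the choice of a product of two projection matrices as the square input, and the simulated transformations are all assumptions made for illustration. It relies only on two standard linear-algebra facts: singular values are unchanged when a matrix is multiplied by orthogonal matrices (permutations included), and eigenvalues are unchanged under a similarity transform.

```python
import numpy as np


def singular_value_fingerprint(weight):
    """Singular values of a weight matrix, in descending order."""
    return np.linalg.svd(weight, compute_uv=False)


def eigenvalue_fingerprint(square):
    """Eigenvalue magnitudes of a square matrix, in descending order."""
    return np.sort(np.abs(np.linalg.eigvals(square)))[::-1]


rng = np.random.default_rng(0)
d = 8  # toy hidden size (assumption; real attention weights are far larger)

# Stand-ins for two attention projection matrices (random, hypothetical).
w_q = rng.standard_normal((d, d))
w_k = rng.standard_normal((d, d))

# Simulated weight manipulation: permute rows, then rotate columns
# with a random orthogonal matrix obtained from a QR decomposition.
perm = np.eye(d)[rng.permutation(d)]
rot, _ = np.linalg.qr(rng.standard_normal((d, d)))
w_q_transformed = perm @ w_q @ rot

sv_unchanged = np.allclose(
    singular_value_fingerprint(w_q),
    singular_value_fingerprint(w_q_transformed),
)

# Similarity transform A -> S^{-1} A S on a square product matrix.
a = w_q @ w_k.T
s = rng.standard_normal((d, d)) + d * np.eye(d)  # shifted to keep S well conditioned
a_transformed = np.linalg.inv(s) @ a @ s

ev_unchanged = np.allclose(
    eigenvalue_fingerprint(a),
    eigenvalue_fingerprint(a_transformed),
    atol=1e-6,
)

# An unrelated matrix yields a different singular value spectrum.
unrelated_differs = not np.allclose(
    singular_value_fingerprint(w_q),
    singular_value_fingerprint(w_k),
)

print(sv_unchanged, ev_unchanged, unrelated_differs)
```

In this toy setting the spectra survive the simulated transformations while an unrelated matrix produces a different spectrum. Quantization, pruning, and fine-tuning perturb the spectrum rather than preserve it exactly, which is presumably why the abstract pairs the extraction step with a learned similarity comparison instead of an exact match.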
