Ghost in the Transformer: Detecting Model Reuse with Invariant Spectral Signatures

Suqing Wang, Ziyang Ma, Li Xinyi, Zuchao Li

TL;DR

This work tackles the problem of verifying the provenance of large language models amid widespread reuse and fine-tuning. It introduces GhostSpec, a data-free, white-box fingerprinting method that leverages invariant spectral signatures from attention weight products, coupled with POSA to align layers across architectures with different depths. The authors define two robust similarity metrics, GhostSpec-mse and GhostSpec-corr, and demonstrate through extensive experiments that GhostSpec reliably distinguishes derivative models from unrelated ones, even under aggressive modifications such as pruning, merging, and expansion. The approach offers a practical tool for intellectual property protection and improved transparency in open-source LLM ecosystems, with open-source code available for replication.

Abstract

Large Language Models (LLMs) are widely adopted, but their high training cost leads many developers to fine-tune existing open-source models. While most adhere to open-source licenses, some falsely claim original training despite clear derivation from public models, raising pressing concerns about intellectual property protection and the need to verify model provenance. In this paper, we propose GhostSpec, a lightweight yet effective method for verifying LLM lineage without access to training data or modification of model behavior. Our approach constructs compact and robust fingerprints by applying singular value decomposition (SVD) to invariant products of internal attention weight matrices. Unlike watermarking or output-based methods, GhostSpec is fully data-free, non-invasive, and computationally efficient. Extensive experiments show it is robust to fine-tuning, pruning, expansion, and adversarial transformations, reliably tracing lineage with minimal overhead. By offering a practical solution for model verification, our method contributes to intellectual property protection and fosters a transparent, trustworthy LLM ecosystem. Our code is available at https://github.com/DX0369/GhostSpec.
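
The idea of taking an SVD over "invariant products" of attention weights can be illustrated with a minimal sketch. The abstract does not say which products are used, so the sketch assumes the query-key product W_Q W_K^T. It also assumes normalization by the largest singular value. The function name `spectral_fingerprint` and the toy shapes are hypothetical and are not taken from the GhostSpec repository. The sketch shows why such a product is a plausible fingerprint: inserting an invertible matrix A between the two projections (W_Q A and W_K A^{-T}) changes both weight matrices but leaves their product, and hence its singular values, unchanged.

```python
import numpy as np


def spectral_fingerprint(w_q, w_k):
    """Normalized singular values of the product w_q @ w_k.T.

    Assumption: the invariant product is W_Q W_K^T and the spectrum is
    scaled by its largest singular value. Both choices are illustrative.
    """
    product = w_q @ w_k.T
    singular_values = np.linalg.svd(product, compute_uv=False)
    return singular_values / singular_values.max()


# Toy weights: model width 16, head width 8 (hypothetical sizes).
rng = np.random.default_rng(0)
w_q = rng.normal(size=(16, 8))
w_k = rng.normal(size=(16, 8))

# An invertible reparameterization that hides the original weights:
# (W_Q A)(W_K A^{-T})^T = W_Q A A^{-1} W_K^T = W_Q W_K^T.
a = rng.normal(size=(8, 8)) + 4.0 * np.eye(8)
w_q_hidden = w_q @ a
w_k_hidden = w_k @ np.linalg.inv(a).T

original = spectral_fingerprint(w_q, w_k)
disguised = spectral_fingerprint(w_q_hidden, w_k_hidden)
print("max spectral difference:", np.abs(original - disguised).max())
```

The individual matrices `w_q_hidden` and `w_k_hidden` differ from the originals, yet the two fingerprints agree up to floating-point error. This is the property a weight-level fingerprint needs in order to survive such transformations.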

Paper Structure

This paper contains 59 sections, 24 equations, 8 figures, 9 tables, 1 algorithm.

Figures (8)

  • Figure 1: GhostSpec extracts singular value spectra from each layer’s attention products to form spectral fingerprints. A pairwise MSE distance matrix is computed, and a penalty-based alignment algorithm matches layers across models of different depths. The final similarity score distinguishes between related and independently trained models.
  • Figure 2: Average MSE of normalized singular values from Q/K/V/O projections. The spectral distance from Llama-2-7b to its fine-tuned variants (blue) is negligible, while the distance to unrelated models (red) is large, confirming the fingerprint's robustness against fine-tuning.
  • Figure 3: The layer-wise trend of the mean of normalized singular values for various models. Models with a shared lineage (e.g., Llama-2-7b-hf and its variants) exhibit highly correlated trends, while unrelated models show divergent patterns.
  • Figure 4: Maximum F1 scores for each method on our dataset. Both GhostSpec variants clearly outperform all baseline methods in accurately distinguishing related from unrelated models.
  • Figure 5: Pairwise structural similarity matrix of prominent open-source models, computed using GhostSpec-mse. The heatmap visualizes the genealogical relationships among these models; higher scores indicate greater similarity.
  • ...and 3 more figures
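
The pipeline in the Figure 1 caption (per-layer spectra, a pairwise MSE distance matrix, then a penalty-based alignment across models of different depths) can be sketched roughly as follows. The caption does not specify the alignment algorithm. The sketch therefore substitutes a generic order-preserving dynamic program with a fixed skip penalty, which should be read as a stand-in rather than the authors' POSA procedure. The names `pairwise_mse` and `align_layers`, and the `skip_penalty` parameter, are hypothetical.

```python
import numpy as np


def pairwise_mse(fp_a, fp_b):
    """MSE between every pair of layer spectra.

    fp_a has shape (layers_a, k), fp_b has shape (layers_b, k);
    the result has shape (layers_a, layers_b).
    """
    diff = fp_a[:, None, :] - fp_b[None, :, :]
    return (diff ** 2).mean(axis=-1)


def align_layers(dist, skip_penalty=0.01):
    """Order-preserving layer alignment (illustrative stand-in, not POSA).

    Matching layer i to layer j costs dist[i, j]; leaving a layer of
    either model unmatched costs skip_penalty. Returns the total cost
    and the list of matched (i, j) index pairs.
    """
    n, m = dist.shape
    cost = np.zeros((n + 1, m + 1))
    cost[1:, 0] = skip_penalty * np.arange(1, n + 1)
    cost[0, 1:] = skip_penalty * np.arange(1, m + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = min(
                cost[i - 1, j - 1] + dist[i - 1, j - 1],  # match i-1 with j-1
                cost[i - 1, j] + skip_penalty,            # skip a layer of A
                cost[i, j - 1] + skip_penalty,            # skip a layer of B
            )
    # Trace back through the table to recover the matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if np.isclose(cost[i, j], cost[i - 1, j - 1] + dist[i - 1, j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif np.isclose(cost[i, j], cost[i - 1, j] + skip_penalty):
            i -= 1
        else:
            j -= 1
    return float(cost[n, m]), pairs[::-1]


# Toy example: model B is model A with layers 2 and 4 removed.
rng = np.random.default_rng(0)
fingerprints_a = rng.uniform(size=(6, 8))
fingerprints_b = fingerprints_a[[0, 1, 3, 5]]

distance = pairwise_mse(fingerprints_a, fingerprints_b)
total_cost, matches = align_layers(distance)
print(total_cost, matches)
```

In this toy setup the alignment recovers which layers of the deeper model survive in the shallower one, and the only cost comes from the two skipped layers. A real similarity score would also need a normalization and a decision threshold, neither of which is given in the captions above.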