Matrix-Driven Identification and Reconstruction of LLM Weight Homology

Ruichong Zhang, Daniel Goldstein

TL;DR

MDIR detects weight homology between large language models by operating directly on their weights, without running model inference. It combines Singular Value Decomposition, polar decomposition, and Large Deviation Theory to compute rigorous $p$-values and, when two models are homologous, to reconstruct the correspondence between their weights. It achieves perfect AUC and accuracy on LeaFBench and produces layer-level mappings. The method is robust to perturbations, including pruning, upscaling, and tokenizer changes, and its recovered transformations are interpretable, revealing provenance signals for weight reuse and upcycling. By formalizing an invariant transformation group for Grouped Query Attention and solving for outer and inner transformations via trace optimization, MDIR offers a principled, statistics-backed framework for weight-homology verification, with potential applications to IP protection and accountability in AI pipelines. A key limitation is that it currently handles only pairwise homology and does not address broader phylogenetic relationships among models, which the authors suggest as future work.

Abstract

We propose Matrix-Driven Identification and Reconstruction (MDIR), a SOTA large language model homology method that accurately detects weight correspondences between models and provides rigorous $p$-value estimation of the statistical significance of these correspondences. Our method does not require model inference, and allows the detection of unattributed reuse or replication of model weights even on low-resource devices as it compares only a single pair of matrices at a time. We leverage matrix analysis, polar decomposition, and Large Deviation Theory (LDT) to achieve accurate reconstruction of weight relationships between models. Notably, MDIR is the first method to achieve perfect scores on both Area-Under-Curve (AUC) and accuracy metrics across different source models on LeaFBench.
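To make the role of polar decomposition and trace optimization concrete, here is a minimal sketch of the classical orthogonal Procrustes problem, which is one standard way to recover an orthogonal map between two matrices. It is not the authors' algorithm: the function names, the matrix shapes, the noise level, and the normalized alignment score are all assumptions chosen for illustration, and the score is not the paper's $p$-value.

```python
import numpy as np

def polar_orthogonal_factor(m):
    """Orthogonal factor Q of the polar decomposition m = Q @ S, computed via SVD."""
    u, _, vt = np.linalg.svd(m, full_matrices=False)
    return u @ vt

def recover_orthogonal_map(w_a, w_b):
    """Orthogonal Q maximizing tr(Q.T @ w_a.T @ w_b), i.e. the best fit of w_a @ Q to w_b."""
    return polar_orthogonal_factor(w_a.T @ w_b)

def alignment_score(w_a, w_b):
    """Maximal trace divided by the Frobenius norms; lies in [0, 1] (illustrative only)."""
    q = recover_orthogonal_map(w_a, w_b)
    return np.trace(q.T @ w_a.T @ w_b) / (np.linalg.norm(w_a) * np.linalg.norm(w_b))

rng = np.random.default_rng(0)
d_in, d = 64, 16                                 # hypothetical matrix shape
w_a = rng.standard_normal((d_in, d))             # stand-in for a "source" weight matrix
q_true, _ = np.linalg.qr(rng.standard_normal((d, d)))
w_b = w_a @ q_true + 0.01 * rng.standard_normal((d_in, d))  # rotated, lightly perturbed copy
w_c = rng.standard_normal((d_in, d))             # independently drawn matrix

q_hat = recover_orthogonal_map(w_a, w_b)
rel_err = np.linalg.norm(w_a @ q_hat - w_b) / np.linalg.norm(w_b)
score_related = alignment_score(w_a, w_b)
score_unrelated = alignment_score(w_a, w_c)
```

In this toy setting the recovered `q_hat` stays close to the hidden rotation, and the score for the derived pair is far higher than for the independent pair. Turning such a gap into a calibrated significance level is the part the paper handles with Large Deviation Theory, which this sketch does not attempt.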

Paper Structure

This paper contains 31 sections, 3 theorems, 79 equations, 8 figures, 4 tables, 3 algorithms.

Key Result

Theorem 1

Define the trace and the corresponding optimal permutation respectively as follows: Suppose an adversary applies a simultaneous scaling and permutation transformation to $E'$, yielding $E'' = \alpha E' P'$, where $P' \in \mathrm{Perm}(n)$ is an arbitrary permutation matrix, and $\alpha > 0$ is a positive scaling coefficient. In this case, the following properties hold: Thus, such adversa
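The theorem's defining equations are not reproduced in this summary, so the following sketch rests on an assumed objective: the trace is taken to be $\max_{P} \mathrm{tr}(P^\top E^\top E')$ over permutation matrices $P$, with the optimal permutation being its maximizer. Under that assumption, the code checks numerically that the transformation $E'' = \alpha E' P'$ scales the maximal trace by $\alpha$ and composes the optimal permutation with $P'$. All function names, sizes, and the brute-force search are illustrative choices, not the paper's procedure.

```python
import itertools
import numpy as np

def perm_matrix(perm):
    """Permutation matrix P with P[perm[j], j] = 1, so (M @ P)[:, j] == M[:, perm[j]]."""
    n = len(perm)
    p = np.zeros((n, n))
    p[list(perm), np.arange(n)] = 1.0
    return p

def best_permutation(m):
    """Brute-force maximizer of tr(P.T @ m) over permutations; only viable for tiny n."""
    n = m.shape[0]
    cols = np.arange(n)
    best = max(itertools.permutations(range(n)),
               key=lambda perm: m[list(perm), cols].sum())
    p = perm_matrix(best)
    return np.trace(p.T @ m), p

rng = np.random.default_rng(1)
n_rows, n = 40, 5                                # hypothetical embedding shape
e = rng.standard_normal((n_rows, n))
hidden = perm_matrix((2, 0, 4, 1, 3))            # hidden channel permutation (assumed)
e1 = e @ hidden + 0.05 * rng.standard_normal((n_rows, n))   # plays the role of E'

alpha = 2.5                                      # adversary's positive scale
p_adv = perm_matrix((4, 3, 2, 1, 0))             # adversary's permutation P'
e2 = alpha * e1 @ p_adv                          # plays the role of E''

t1, p1 = best_permutation(e.T @ e1)
t2, p2 = best_permutation(e.T @ e2)

# Under the assumed objective: the trace scales by alpha, the permutation composes with P'.
trace_scales = np.isclose(t2, alpha * t1)
perm_composes = np.array_equal(p2, p1 @ p_adv)
```

The exhaustive search is used only to keep the example dependency-light. For realistic channel counts, a linear assignment solver such as `scipy.optimize.linear_sum_assignment` with `maximize=True` would be the usual replacement.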

Figures (8)

  • Figure 1: Comparison. We use $\log(-\log(p))$ to crop the $p$ values for better visualization. Value $0$ indicates no observable significance.
  • Figure 2: Layerwise mapping of MDIR vs. REEF on SOLAR-10.7B and Mistral-7B-v0.1.
  • Figure 3: MDIR suggests homology between Llama-3.1-8B and Llama-3.2-1B, yielding a $p$-value of $10^{-6,918}$. For model pruning, the irregular oblique curves (the slope is approximately $1/2$, indicating that half of the channels are retained) can be clearly identified in $\tilde{U}$ from vocabulary as well as inner transformations in the attention module.
  • Figure 4: For model upcycling, MDIR suggests homology between Qwen1.5-1.8B and Qwen1.5-MoE-A2.7B, yielding a $p$-value of $10^{-361,049}$. The diagonal patterns for vocabulary embedding and attention modules indicate that these modules are directly inherited from the predecessor model, and show no evidence of permutation or channel reselection before the upscaling process.
  • Figure 5: For independently developed models, MDIR detects no statistically significant homology between DeepSeek-V3-Base and Kimi-K2-Instruct, with no clear pattern or statistically significant $p$-value observed.
  • ...and 3 more figures
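The $p$-values quoted in the captions, such as $10^{-6,918}$ and $10^{-361,049}$, are far below what a double-precision float can represent, so the $\log(-\log(p))$ transform from Figure 1 is most easily computed from the exponent directly. The sketch below assumes natural logarithms and reads "Value $0$ indicates no observable significance" as clipping at zero; both are assumptions about the plotting convention, and the function name is hypothetical.

```python
import math

def loglog_significance(log10_p):
    """Return max(0, ln(-ln p)) given log10(p), without ever forming p itself."""
    neg_ln_p = -log10_p * math.log(10.0)   # -ln p = -log10(p) * ln(10)
    if neg_ln_p <= 1.0:                    # p >= 1/e gives ln(-ln p) <= 0, clipped to 0
        return 0.0
    return math.log(neg_ln_p)

pruning = loglog_significance(-6918)       # exponent from the Figure 3 caption
upcycling = loglog_significance(-361049)   # exponent from the Figure 4 caption
no_signal = loglog_significance(0.0)       # p = 1
```

Working from $\log_{10} p$ avoids underflow: evaluating `10.0 ** -6918` in floating point simply returns `0.0`, after which the logarithm is undefined.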

Theorems & Definitions (7)

  • Definition 1
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof