Matrix-Driven Identification and Reconstruction of LLM Weight Homology
Ruichong Zhang, Daniel Goldstein
TL;DR
MDIR detects weight homology between large language models by operating directly on their weights, without running model inference. It combines Singular Value Decomposition, polar decomposition, and Large Deviation Theory to compute rigorous $p$-values and, when two models are homologous, to reconstruct the correspondence between their weights. It achieves perfect AUC and accuracy on LeaFBench and supports layer-level mappings. The method is robust to perturbations, including pruning, upscaling, and tokenizer changes, and the transformations it recovers are interpretable, revealing provenance signals for weight reuse and upcycling. By formalizing an invariant transformation group for Grouped Query Attention and solving for the outer and inner transformations via trace optimization, MDIR offers a principled, statistically grounded framework for weight-homology verification, with potential applications in IP protection and accountability in AI pipelines. A key limitation is that the method currently addresses only pairwise homology and not broader phylogenetic relationships among models, which the authors suggest as future work.
Abstract
We propose Matrix-Driven Identification and Reconstruction (MDIR), a state-of-the-art method for detecting homology between large language models. MDIR accurately identifies weight correspondences between models and provides rigorous $p$-value estimates of the statistical significance of these correspondences. Our method does not require model inference, and because it compares only a single pair of matrices at a time, it can detect unattributed reuse or replication of model weights even on low-resource devices. We leverage matrix analysis, polar decomposition, and Large Deviation Theory (LDT) to achieve accurate reconstruction of weight relationships between models. Notably, MDIR is the first method to achieve perfect scores on both area under the curve (AUC) and accuracy across different source models on LeaFBench.
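To make the role of polar decomposition concrete, the following is a minimal sketch, not the MDIR procedure itself. It assumes a simplified setting in which one weight matrix is an orthogonally transformed, slightly noisy copy of another, and it recovers the transformation as the orthogonal polar factor of a cross-product matrix (the classical orthogonal Procrustes solution, which maximizes a trace objective). The function names, matrix sizes, and noise level are hypothetical choices for illustration; the sketch includes neither the $p$-value computation nor the Grouped Query Attention transformation group.

```python
import numpy as np

def orthogonal_polar_factor(m):
    """Return the orthogonal factor U of the polar decomposition m = U P."""
    # With m = W S V^T, the orthogonal polar factor is U = W V^T.
    left, _, right_t = np.linalg.svd(m, full_matrices=False)
    return left @ right_t

def recover_rotation(w_a, w_b):
    """Orthogonal U maximizing trace(U^T w_b w_a^T).

    Equivalently, U minimizes the Frobenius norm of U @ w_a - w_b.
    """
    return orthogonal_polar_factor(w_b @ w_a.T)

# Synthetic "homologous" pair (illustrative assumption): w_b is a hidden
# orthogonal transform of w_a plus small noise.
rng = np.random.default_rng(0)
d, n = 16, 64
w_a = rng.standard_normal((d, n))
q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # hidden orthogonal matrix
w_b = q @ w_a + 0.01 * rng.standard_normal((d, n))

u = recover_rotation(w_a, w_b)
rel_err = np.linalg.norm(u @ w_a - w_b) / np.linalg.norm(w_b)
print(f"relative reconstruction error: {rel_err:.4f}")
```

A small relative error indicates that a single orthogonal map explains `w_b` in terms of `w_a`. Turning such a fit into a calibrated significance statement is the part the paper handles with Large Deviation Theory.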
