Fine-Tuning Integrity for Modern Neural Networks: Structured Drift Proofs via Norm, Rank, and Sparsity Certificates

Zhenhang Shang, Kani Chen

Abstract

Fine-tuning is now the primary method for adapting large neural networks, but it also introduces new integrity risks. An untrusted party can insert backdoors, change safety behavior, or overwrite large parts of a model while claiming only small updates. Existing verification tools focus on inference correctness or full-model provenance and do not address this problem. We introduce Fine-Tuning Integrity (FTI) as a security goal for controlled model evolution. An FTI system certifies that a fine-tuned model differs from a trusted base only within a policy-defined drift class. We propose Succinct Model Difference Proofs (SMDPs) as a new cryptographic primitive for enforcing these drift constraints. SMDPs provide zero-knowledge proofs that the update to a model is norm-bounded, low-rank, or sparse. The verifier cost depends only on the structure of the drift, not on the size of the model. We give concrete SMDP constructions based on random projections, polynomial commitments, and streaming linear checks. We also prove an information-theoretic lower bound showing that some form of structure is necessary for succinct proofs. Finally, we present architecture-aware instantiations for transformers, CNNs, and MLPs, together with an end-to-end system that aggregates block-level proofs into a global certificate.

Paper Structure

This paper contains 109 sections, 10 theorems, 67 equations, 4 figures, 3 tables.

Key Result

Lemma 4.1

Let $\Delta \in \mathbb{R}^{d \times d'}$ and let $r \in \{-1,+1\}^{dd'}$ have independent Rademacher entries. Then for any $t > 0$, where the Frobenius norm is ...
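
As a rough numerical illustration of the kind of concentration Lemma 4.1 refers to, the sketch below assumes the standard two-sided Hoeffding bound for a Rademacher sum, $\Pr[\,|\langle r, \mathrm{vec}(\Delta)\rangle| \ge t\,] \le 2\exp(-t^2 / (2\|\Delta\|_F^2))$. This form is an assumption: the paper's exact statement may differ in constants or normalization. The helper names (`rademacher_projection`, `hoeffding_tail_bound`, `empirical_tail`) and the toy $6 \times 8$ matrix are hypothetical and are not taken from the paper; the code only compares an empirical tail frequency against the assumed bound and is not an NBDP implementation.

```python
import math
import random


def frobenius_norm(delta):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in delta for x in row))


def rademacher_projection(delta, rng):
    """Return <r, vec(delta)> for a freshly sampled Rademacher vector r."""
    return sum(rng.choice((-1.0, 1.0)) * x for row in delta for x in row)


def hoeffding_tail_bound(fro, t):
    """Assumed bound 2*exp(-t^2 / (2*fro^2)), capped at 1 (hypothetical helper)."""
    if fro == 0.0:
        # A zero matrix always projects to 0, so the tail is empty for t > 0.
        return 0.0 if t > 0.0 else 1.0
    return min(1.0, 2.0 * math.exp(-(t * t) / (2.0 * fro * fro)))


def empirical_tail(delta, t, trials, rng):
    """Fraction of sampled projections whose magnitude is at least t."""
    hits = sum(abs(rademacher_projection(delta, rng)) >= t for _ in range(trials))
    return hits / trials


# Toy update matrix standing in for a model difference (illustrative only).
rng = random.Random(0)
delta = [[rng.gauss(0.0, 0.1) for _ in range(8)] for _ in range(6)]
fro = frobenius_norm(delta)

# Compare the empirical tail with the assumed bound at t = k * ||delta||_F.
for k in (1.0, 2.0, 3.0):
    t = k * fro
    print(k, empirical_tail(delta, t, 2000, rng), hoeffding_tail_bound(fro, t))
```

In this sketch the threshold is expressed as a multiple of $\|\Delta\|_F$, so the assumed bound depends only on the multiplier $k$ and not on the matrix dimensions, which is the property that makes a random projection a plausible proxy for a norm check.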

Figures (4)

  • Figure 1: Workflow of the proposed FTI system. The verifier specifies a drift policy $\mathcal{P}$, the prover commits to base and fine-tuned models, generates block-level SMDPs (NBDP/MRDP/SDIP), aggregates them into a single proof, and the verifier checks consistency with the policy before accepting or rejecting.
  • Figure 2: Block-level proof sizes for NBDP, MRDP, and SDIP. Values shown are representative midpoints of the empirical ranges.
  • Figure 3: Per-block verification time for NBDP, MRDP, and SDIP instances.
  • Figure 4: End-to-end prover time for a 7B-parameter transformer, ResNet-50, and a 6-layer MLP. Times are measured on a single A100 GPU.

Theorems & Definitions (15)

  • Definition 3.1: Drift compliance relation
  • Definition 3.2
  • Definition 3.3: Succinct Model Difference Proof (SMDP)
  • Lemma 4.1: Hoeffding concentration [pelekis2017hoeffding]
  • Theorem 4.2: FTI-soundness of NBDP
  • Theorem 4.3: Succinctness
  • Lemma 4.4: Schwartz–Zippel [moshkovitz2010alternative]
  • Theorem 4.5: FTI-soundness of MRDP
  • Theorem 4.6: Succinctness
  • Lemma 4.7
  • ...and 5 more