Functional Subspace Watermarking for Large Language Models

Zikang Ding; Junhao Li; Suling Wu; Junchi Yao; Hongbo Liu; Lijie Hu

Abstract

Model watermarking utilizes internal representations to protect the ownership of large language models (LLMs). However, these features inevitably undergo complex distortions during realistic model modifications such as fine-tuning, quantization, or knowledge distillation, making reliable extraction extremely challenging. Despite extensive research on model-side watermarking, existing methods still lack sufficient robustness against parameter-level perturbations. To address this gap, we propose Functional Subspace Watermarking (FSW), a framework that anchors ownership signals into a low-dimensional functional backbone. Specifically, we first solve a generalized eigenvalue problem to extract a stable functional subspace for watermark injection, while introducing an adaptive spectral truncation strategy to achieve an optimal balance between robustness and model utility. Furthermore, a vector consistency constraint is incorporated to ensure that watermark injection does not compromise the original semantic performance. Extensive experiments across various LLM architectures and datasets demonstrate that our method achieves superior detection accuracy and statistical verifiability under multiple model attacks, maintaining robustness that outperforms existing state-of-the-art (SOTA) methods.

Paper Structure (36 sections, 11 equations, 6 figures, 11 tables, 1 algorithm)

Introduction
Related Work
Content Watermarking
Model Watermarking
Threat Model
Adversary Capabilities and Modifications.
Functional Preservation Constraint.
Verification Protocol.
Methodology
Phase 1: Representation Extraction
Phase 2: Geometry Analysis
Functional Sensitivity.
Invariance to Compression Operators.
Phase 3: Subspace Construction & Optimization
Spectral Truncation.
...and 21 more sections

Figures (6)

Figure 1: Conceptual comparison of watermark robustness under common model-side attacks. Top (Previous): Conventional internal watermarks are vulnerable to post-hoc modifications like fine-tuning, quantization, pruning, and distillation, which often result in signal erasure. Bottom (FSW): Our proposed framework anchors ownership signals into a stable functional backbone, ensuring that the watermark remains detectable even after significant parameter-level perturbations.
Figure 2: Overall framework of FSW.
Figure 3: Functional Preservation and Ownership Detectability.
Figure 4: Bit Capacity vs. Utility and Detectability
Figure 5: Robustness–Capacity Trade-off under Different Embedding Dimensions
...and 1 more figure
