Watermarking Graph Neural Networks via Explanations for Ownership Protection

Jane Downer; Ren Wang; Binghui Wang

TL;DR

This work addresses intellectual property (IP) protection for Graph Neural Networks (GNNs) by proposing an explanation-based watermarking method that embeds the ownership signal in GNN explanations rather than in training data or model outputs. The GNN is trained with a dual objective: minimize the node classification loss and align the explanations of a small set of watermarked subgraphs with a secret watermark. Ownership is then verified in a black-box manner by testing whether the binarized explanations of the candidate subgraphs are significantly more similar than expected by chance. The authors prove that locating the watermark is NP-hard, even for an adversary with full knowledge of the method, and show empirically that the watermark survives pruning and fine-tuning while task accuracy remains high and that it is difficult to detect or remove under realistic attack models. The method avoids data pollution and the ownership ambiguity caused by intentional misclassification, and it is validated across multiple datasets and architectures.
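The dual objective can be pictured as a classification loss plus a penalty on explanation entries that disagree with the watermark. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the hinge form of the penalty, the `margin` and `weight` parameters, and all function names are hypothetical, and the explanations are passed in as plain lists rather than computed from a GNN.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for one node."""
    return -math.log(probs[label])

def watermark_penalty(explanations, watermark, margin=0.1):
    """Hypothetical hinge penalty on explanation/watermark disagreement.

    Each term is zero once an explanation entry has the same sign as the
    corresponding watermark entry (+1 or -1) with magnitude >= margin.
    """
    return sum(
        max(0.0, margin - w * e)
        for explanation in explanations
        for w, e in zip(watermark, explanation)
    )

def dual_objective(probs, labels, explanations, watermark, weight=1.0, margin=0.1):
    """Mean classification loss plus weighted watermark penalty (assumed form)."""
    classification = sum(cross_entropy(p, y) for p, y in zip(probs, labels)) / len(labels)
    return classification + weight * watermark_penalty(explanations, watermark, margin)

# Toy inputs: one node with a two-class prediction, one subgraph explanation.
watermark = [1, -1, 1]
aligned = [[0.5, -0.5, 0.5]]      # signs match the watermark
misaligned = [[-0.5, 0.5, -0.5]]  # signs oppose the watermark
probs, labels = [[0.5, 0.5]], [0]

print(dual_objective(probs, labels, aligned, watermark))
print(dual_objective(probs, labels, misaligned, watermark))
```

In this toy setting the aligned explanation contributes no penalty, so the objective reduces to the classification loss, while the misaligned explanation raises it. The `weight` parameter stands in for whatever trade-off the training procedure uses between the two terms.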

Abstract

Graph Neural Networks (GNNs) are the mainstream method for learning from pervasive graph data and are widely deployed in industry, making their intellectual property valuable. However, protecting GNNs from unauthorized use remains a challenge. Watermarking, which embeds ownership information into a model, is a potential solution, but existing watermarking methods have two key limitations. First, almost all of them focus on non-graph data, leaving the watermarking of GNNs for complex graph data largely unexplored. Second, the de facto backdoor-based watermarking methods pollute training data and induce ownership ambiguity through intentional misclassification. Our explanation-based watermarking inherits the strengths of backdoor-based methods (e.g., robustness to watermark removal attacks) but avoids data pollution and eliminates intentional misclassification. In particular, our method learns to embed the watermark in GNN explanations such that this unique watermark is statistically distinct from other potential solutions, and ownership claims must show statistical significance to be verified. We theoretically prove that, even with full knowledge of our method, locating the watermark is an NP-hard problem. Empirically, our method is robust to removal attacks such as fine-tuning and pruning. By addressing these challenges, our approach marks a significant advancement in protecting GNN intellectual property.
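The requirement that an ownership claim show statistical significance can be illustrated with a small test on binarized explanations. The sketch below is one possible reading, not the paper's procedure: the agreement statistic, the null distribution built from independent random sign vectors, the normal approximation, and every function name are assumptions made for illustration.

```python
import math
import random
from statistics import mean, pstdev

def binarize(explanation):
    """Keep only the sign of each explanation entry (zero counts as +1)."""
    return [1 if value >= 0 else -1 for value in explanation]

def agreement_count(binarized):
    """Count positions at which all binarized explanations share one sign."""
    return sum(1 for column in zip(*binarized) if len(set(column)) == 1)

def agreement_p_value(explanations, n_null=2000, seed=0):
    """One-sided p-value for the observed agreement under an assumed null.

    Assumption: the null distribution is simulated from independent random
    sign vectors and approximated as normal; a real verifier would need a
    null that reflects explanations of non-watermarked subgraphs.
    """
    rng = random.Random(seed)
    binarized = [binarize(e) for e in explanations]
    n_vectors, n_features = len(binarized), len(binarized[0])
    observed = agreement_count(binarized)
    null = [
        agreement_count(
            [[rng.choice((-1, 1)) for _ in range(n_features)] for _ in range(n_vectors)]
        )
        for _ in range(n_null)
    ]
    mu, sigma = mean(null), pstdev(null)
    if sigma == 0:
        raise ValueError("null distribution is degenerate; use more features")
    z = (observed - mu) / sigma
    return observed, 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability

# Four candidate explanations with identical sign patterns (a plausible claim)
claimed = [[0.3, -0.2, 0.8, -0.1] * 5 for _ in range(4)]
# Four candidate explanations whose signs never all agree (an implausible claim)
unrelated = [[1.0] * 20, [-1.0] * 20, [1.0] * 20, [-1.0] * 20]

print(agreement_p_value(claimed))
print(agreement_p_value(unrelated))
```

A claim would be accepted only when the p-value falls below a chosen significance level; candidate subgraphs whose explanations agree no more than chance would be rejected.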

Paper Structure (26 sections, 19 equations, 12 figures, 4 tables, 2 algorithms)

Introduction
Related Work
Background and Problem Formulation
GNNs for Node Classification
GNN Explanation
Problem Formulation
Methodology
Watermark Embedding
Ownership Verification
Watermark Design
Locating the Watermarked Subgraphs
Experiments
Setup
Results
Effectiveness and Uniqueness
...and 11 more sections

Figures (12)

Figure 1: Overview of our explanation-based GNN watermarking method. During embedding, $f$ is optimized to (1) minimize node classification loss and (2) align watermarked subgraph explanations with $\textbf{w}$. The similarity of $G^{cdt}$'s binarized explanations, $\{\hat{\textbf{e}}^{cdt}_i\}_{i=1}^T$, is tested for significance during ownership verification. In this example, $G^{cdt}$ are not the watermarked subgraphs; therefore, $\{\hat{\textbf{e}}^{cdt}_i\}_{i=1}^T$ fail to exhibit significant similarity and are rejected.
Figure 2: Effect of pruning (top) and fine-tuning (bottom) on MI $p$-value, under default settings (GraphSAGE, $T=4$, $s=0.005$). See Appendix figures \ref{pruning_figure_GCN}-\ref{fine_tune_varied_lr} for results with varied architectures, $T$, $s$, and learning rates.
Figure 3: The probability that a randomly-chosen subgraph overlaps with a watermarked subgraph.
Figure 4: Watermarking metrics for varied number of watermarked subgraphs, $T$.
Figure 5: Watermarking metrics for varied watermarked subgraph size, $s$.
...and 7 more figures
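
The quantity named in the Figure 3 caption can be approximated in closed form under simplifying assumptions. The sketch below assumes that a random subgraph is a uniformly sampled node set, that the $T$ watermarked subgraphs are disjoint node sets of the same size, and that $s$ is a fraction of the graph's nodes; none of these assumptions is confirmed by the captions, and the function name is hypothetical.

```python
from math import comb

def overlap_probability(n_nodes, subgraph_size, n_watermarked):
    """P(random subgraph shares >= 1 node with any watermarked subgraph).

    Assumptions: the random subgraph is a uniformly sampled set of
    `subgraph_size` nodes, and the watermarked subgraphs are disjoint
    node sets of that same size.
    """
    marked = subgraph_size * n_watermarked
    if marked > n_nodes:
        raise ValueError("watermarked subgraphs cannot cover more nodes than the graph has")
    # Complement of the probability that every sampled node is unmarked.
    return 1 - comb(n_nodes - marked, subgraph_size) / comb(n_nodes, subgraph_size)

# Hypothetical graph of 1000 nodes, subgraphs of 5 nodes (0.5% of nodes), T = 4.
print(overlap_probability(1000, 5, 4))
```

Under these assumptions the probability grows with both the number and the size of the watermarked subgraphs, and it is zero when no subgraph is watermarked.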
