Transferable Watermarking to Self-supervised Pre-trained Graph Encoders by Trigger Embeddings

Xiangyu Zhao, Hanzhou Wu, Xinpeng Zhang

TL;DR

The paper tackles copyright protection for graph self-supervised learning (GSSL) encoders by introducing a backdoor watermarking scheme that embeds trigger-specific embeddings into the encoder during pretraining. By adding watermark losses $L_{\text{in}}$ and $L_{\text{ext}}$ to the usual utility loss, the encoder learns to map trigger ego-graphs to a compact cluster, enabling black-box verification via a concentration score $CS$ with threshold $\tau$. Empirical results across multiple GSSL models and downstream tasks show high transferability of the watermark, minimal fidelity loss, and robustness to pruning, along with visualization and ablation analyses validating the design. The approach provides a practical, transferable, and verifiable watermark for GSSL encoders, with future work aimed at resisting model extraction and enhancing watermark resilience.

Abstract

Recent years have witnessed the rapid development of Graph Self-supervised Learning (GSSL), which enables the pre-training of transferable foundation graph encoders. However, the easy-to-plug-in nature of such encoders makes them vulnerable to copyright infringement. To address this issue, we develop a novel watermarking framework to protect graph encoders in GSSL settings. The key idea is to force the encoder to map a set of specially crafted trigger instances into a unique compact cluster in the output embedding space during model pre-training. Consequently, when the encoder is stolen and concatenated with any downstream classifier, the resulting model inherits the 'backdoor' of the encoder and predicts the trigger instances to be in a single category with high probability, regardless of the ground truth. Experimental results show that the embedded watermark can be transferred to various downstream tasks in black-box settings, including node classification, link prediction and community detection, which forms a reliable watermark verification system for GSSL in practice. This approach also shows satisfactory performance in terms of model fidelity, reliability and robustness.
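
The black-box verification idea described above can be illustrated with a minimal sketch. It assumes the concentration score is the fraction of trigger predictions that fall into the most frequent predicted category; the summary does not give the exact definition, so this is one plausible reading and not necessarily the authors' formula. The function names `concentration_score` and `verify_watermark` are hypothetical, and the threshold `tau` must be supplied by the caller because no value is stated here.

```python
from collections import Counter
from typing import Hashable, Sequence


def concentration_score(trigger_predictions: Sequence[Hashable]) -> float:
    """Fraction of trigger predictions that share the most common label.

    Assumption: this is an illustrative stand-in for the paper's CS metric,
    not its confirmed definition.
    """
    if not trigger_predictions:
        raise ValueError("need at least one trigger prediction")
    # most_common(1) returns [(label, count)] for the dominant label.
    _, top_count = Counter(trigger_predictions).most_common(1)[0]
    return top_count / len(trigger_predictions)


def verify_watermark(trigger_predictions: Sequence[Hashable], tau: float) -> bool:
    """Flag a suspect model when the score reaches the threshold tau.

    tau is caller-supplied; the value to use is not given in this summary.
    """
    return concentration_score(trigger_predictions) >= tau


# Labels returned by a suspect downstream model for ten trigger inputs
# (made-up values for illustration only).
suspect_outputs = [2, 2, 2, 2, 1, 2, 2, 2, 2, 2]
score = concentration_score(suspect_outputs)
flagged = verify_watermark(suspect_outputs, tau=0.8)
```

Only the predicted labels are needed, which matches the black-box setting: the verifier queries the suspect model with trigger instances and never inspects its weights or embeddings.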

Paper Structure

This paper contains 22 sections, 6 equations, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: Overview of the proposed watermarking method.
  • Figure 2: The watermark performance w.r.t. model pruning with different pruning rates. Here, we record the curves of the concentration scores of trigger embeddings (CS) and the normal model performance in three downstream tasks (ACC, AUC and NMI). (a) GGD, Cora, (b) GGD, Citeseer.
  • Figure 3: t-SNE visualization of the node embeddings generated by watermarked encoders. Scattered nodes in different colors represent the embeddings of nodes in different categories. Trigger embeddings are shown in black. (a) GGD, Cora, (b) GGD, Citeseer, (c) GraphCL, Cora, (d) GraphCL, Citeseer.
  • Figure 4: The watermark performance (concentration scores) w.r.t. different $\lambda_1$ and $\lambda_2$ settings. (a) GGD, Cora, (b) GGD, Citeseer.