GUISE: Graph GaUssIan Shading watErmark

Renyi Yang

TL;DR

This work adapts Gaussian Shading, a proven performance-lossless watermarking technique, to the latent graph diffusion domain. It simplifies the watermark diffusion process through duplication and padding, making it suitable for various message types.

Abstract

In the expanding field of generative artificial intelligence, robust watermarking is essential to protect intellectual property and maintain content authenticity. Watermarking techniques have traditionally been developed for information-rich media such as images and audio, and have not been adequately adapted to graph-based data, particularly molecular graphs. Latent 3D graph diffusion (LDM-3DG) is an emerging approach to molecular graph generation that manages the complexity of molecular structures while preserving essential symmetries and topological features. We adapt Gaussian Shading, a proven performance-lossless watermarking technique, to the latent graph diffusion domain to protect this new technology. Our adaptation simplifies the watermark diffusion process through duplication and padding, making it suitable for various message types. We conduct several experiments using the LDM-3DG model on the publicly available QM9 and Drugs datasets to assess the robustness and effectiveness of our technique. Our results show that the watermarked molecules maintain statistical parity with the original molecules in 9 out of 10 performance metrics. Moreover, they exhibit a 100% detection rate and a 99% extraction rate in a 2D decoded pipeline, while also showing robustness against post-editing attacks.
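
To make the "duplication and padding" idea concrete, here is a minimal sketch of how a short bit message could be spread over a latent vector and recovered again. It is an illustration under stated assumptions, not the paper's implementation: all function names are hypothetical, the XOR with a seeded pseudo-random stream is only a stand-in for whatever encryption the authors use (it is not cryptographically secure), bits are embedded one per latent dimension by sampling from the negative or positive half of a standard normal, and the diffusion, decoding, and inversion steps are skipped entirely, so the latent is read back unchanged.

```python
import random
from statistics import NormalDist

def duplicate_and_pad(message, latent_len):
    """Repeat the message bits, then zero-pad the tail to fill latent_len slots."""
    copies = latent_len // len(message)
    if copies == 0:
        raise ValueError("message is longer than the latent")
    stream = message * copies
    return stream + [0] * (latent_len - len(stream)), copies

def xor_stream(bits, key):
    """XOR with a key-seeded pseudo-random bitstream (stand-in for real encryption).
    Applying it twice with the same key returns the original bits."""
    rng = random.Random(key)
    return [b ^ rng.getrandbits(1) for b in bits]

def sample_latent(bits, seed):
    """One standard-normal draw per bit: bit 0 -> negative half, bit 1 -> positive half."""
    rng = random.Random(seed)
    normal = NormalDist()
    latent = []
    for b in bits:
        p = (b + rng.random()) / 2.0          # quantile inside the chosen half
        p = min(max(p, 1e-12), 1.0 - 1e-12)   # inv_cdf needs 0 < p < 1
        latent.append(normal.inv_cdf(p))
    return latent

def extract(latent, key, msg_len, copies):
    """Read signs, undo the XOR, and majority-vote over the duplicated copies."""
    bits = [1 if z >= 0 else 0 for z in latent]
    stream = xor_stream(bits, key)
    votes = [sum(stream[i + c * msg_len] for c in range(copies))
             for i in range(msg_len)]
    return [1 if 2 * v > copies else 0 for v in votes]

def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

message = [1, 0, 1, 1, 0, 0, 1, 0]
stream, copies = duplicate_and_pad(message, 50)   # 6 copies + 2 padding bits
latent = sample_latent(xor_stream(stream, key=1234), seed=7)
recovered = extract(latent, key=1234, msg_len=len(message), copies=copies)
print(hamming(message, recovered))  # 0 in this noise-free round trip
```

Because every message bit appears in several copies, a few sign flips in the recovered latent are absorbed by the majority vote. That redundancy is what this sketch relies on for robustness; how ties and padding are handled in the actual method is not specified in this section.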

Paper Structure

This paper contains 17 sections, 7 equations, 6 figures, 4 tables, 3 algorithms.

Figures (6)

  • Figure 1: Architecture of the latent diffusion model. The left half represents the training phase where data is encoded to latent space and diffused to Gaussian noise. The right half represents the inference phase where noise is denoised to latent representation and decoded to data.
  • Figure 2: GUISE framework. The watermark is duplicated and encrypted to generate a random bitstream. The bitstream drives Gaussian noise sampling to produce the latent, which is denoised by DDIM sampling and decoded into a watermarked molecule. The watermark is extracted by reversing these operations and detected by comparing Hamming distances between bitstreams.
  • Figure 3: Illustration of the molecule "C#CCCOC(C)=O" and its Equivariant SMILES, Addition of Hydrogens, and One Less Decoding representations.
  • Figure 4: Watermark detection rate against the theoretical false positive rate for the watermarked molecule set and three modified molecule sets.
  • Figure 5: Detection rate on watermarked and modified datasets against the detection rate on non-watermarked datasets. The area under the ROC curve is reported in the legends.
  • ...and 1 more figures
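
The "theoretical false positive rate" in the Figure 4 caption is not defined in this section. One common way to obtain such a rate for Hamming-distance detection, sketched below, assumes that the bits extracted from a non-watermarked molecule behave like independent fair coin flips, so the number of bits matching a k-bit watermark follows a Binomial(k, 1/2) distribution. Both the assumption and the function names are illustrative; the paper's exact formula may differ.

```python
from math import comb

def false_positive_rate(k, tau):
    """P(at least tau of k random bits match the watermark) under Binomial(k, 1/2)."""
    return sum(comb(k, i) for i in range(tau, k + 1)) / 2 ** k

def threshold_for_fpr(k, target):
    """Smallest match count tau whose false positive rate does not exceed target."""
    for tau in range(k + 1):
        if false_positive_rate(k, tau) <= target:
            return tau
    raise ValueError("target is below 2**-k; no threshold can reach it")

print(false_positive_rate(8, 8))    # 1/256 = 0.00390625
print(threshold_for_fpr(8, 0.01))   # 8, since 7 matches gives 9/256 > 0.01
```

Under this model, sweeping the target rate and measuring how often watermarked molecules still clear the resulting threshold would give a curve of the kind described for Figure 4.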