GS-Marker: Generalizable and Robust Watermarking for 3D Gaussian Splatting

Lijiang Li, Jinglu Wang, Xiang Ming, Yan Lu

TL;DR

GS-Marker tackles the challenge of generalizable, robust watermarking for 3D Gaussian Splatting by introducing a single-pass framework that embeds watermarks into 3DGS via a 3D encoder, distortion layers, and a 2D decoder. A key contribution is the Adaptive Marker Control mechanism, which adaptively perturbs the initially optimized 3DGS to escape local minima and balance watermark decoding with rendering fidelity during training. Empirical results across source and target domains show that GS-Marker achieves superior decoding accuracy and rendering quality while significantly reducing embedding time compared with per-scene optimization baselines. This work enables scalable, robust invisible watermarking for 3D assets in Generative AI pipelines, with practical impact on media provenance and asset protection while maintaining high visual fidelity.

Abstract

In the Generative AI era, safeguarding 3D models has become increasingly urgent. While invisible watermarking is well-established for 2D images with encoder-decoder frameworks, generalizable and robust solutions for 3D remain elusive. The main difficulty arises from the renderer between the 3D encoder and 2D decoder, which disrupts direct gradient flow and complicates training. Existing 3D methods typically rely on per-scene iterative optimization, resulting in time inefficiency and limited generalization. In this work, we propose a single-pass watermarking approach for 3D Gaussian Splatting (3DGS), a well-known yet underexplored representation for watermarking. We identify two major challenges: (1) ensuring effective training generalized across diverse 3D models, and (2) reliably extracting watermarks from free-view renderings, even under distortions. Our framework, named GS-Marker, incorporates a 3D encoder to embed messages, distortion layers to enhance resilience against various distortions, and a 2D decoder to extract watermarks from renderings. A key innovation is the Adaptive Marker Control mechanism that adaptively perturbs the initially optimized 3DGS, escaping local minima and improving both training stability and convergence. Extensive experiments show that GS-Marker outperforms per-scene training approaches in terms of decoding accuracy and model fidelity, while also significantly reducing computation time.

Paper Structure

This paper contains 30 sections, 8 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: (a) Generalizable 2D watermarking handles multiple images and messages in a single forward pass. (b) Extending to 3D is challenging due to the renderer bottleneck that disrupts direct gradient flow between the encoder and decoder. (c) Existing 3D approaches [copyrnerf, waterf, gs_hider, NEURIPS2024_39cee562, jang20243d] often rely on iterative optimization for each model and message, which is time-consuming. In this paper, we tackle generalizable 3D watermarking, enabling single-pass embedding of watermarks into 3D models and robust extraction from their rendered images.
  • Figure 2: Escape local minimum. Since the input $\mathcal{X}$ lies near a local minimum of the rendering loss $\mathcal{L}_r$, its gradient is near 0. The total loss gradient $\nabla_{{\mathcal{X}}}\,\mathcal{L} \approx \lambda\,\nabla_{{\mathcal{X}}}\,\mathcal{L}_w({\mathcal{X}})$, which can conflict with $\mathcal{L}_r$. To address this, we introduce a suitable perturbation $\boldsymbol{\Delta}$ for $\mathcal{X}$, helping the model escape the local minimum and balance both objectives effectively.
  • Figure 3: Method overview. Given the input 3DGS $\mathcal{X}$ and the watermark $M$, GS-Marker embeds $M$ into $\mathcal{X}$ within a single forward pass, allowing $M$ to be extracted from rendered images even after undergoing both 3D and 2D distortions. Our framework comprises three key components: (1) a 3D encoding module, which applies adaptive marker control to adjust $\mathcal{X}$, forming $\tilde{\mathcal{X}}$ to escape local minima, followed by an embedding network that transforms $(\tilde{\mathcal{X}}, M)$ into the watermarked 3DGS $\hat{\mathcal{X}}$; (2) a distortion module, incorporating 3D and 2D distortions to enhance robustness; and (3) a 2D decoding module, which extracts the watermark from rendered images.
  • Figure 4: Qualitative comparisons between our method and the baseline. We show the differences (×10) between the images rendered by the input and watermarked 3DGS. Our method achieves better PSNR and bit accuracy than the baseline.
  • Figure 5: Qualitative comparisons between our method and WateRF [waterf] on the Blender dataset [nerf]. Our method achieves better visual quality than WateRF.
  • ...and 4 more figures
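
The gradient argument in the Figure 2 caption can be illustrated with a toy one-dimensional example. The sketch below is not the paper's implementation: it assumes a total loss of the form $\mathcal{L} = \mathcal{L}_r + \lambda\,\mathcal{L}_w$ (implied by the caption but not stated on this page), and the quadratic losses, the scalar parameter, and all names (`grad_render`, `grad_watermark`, `x_ref`, `x_wm`, `lam`, `delta`) are hypothetical stand-ins chosen for illustration only.

```python
# Toy 1-D illustration of the Figure 2 argument (assumed setup, not the paper's code).
# Assumption: total loss L(x) = L_r(x) + lam * L_w(x), with simple quadratics:
#   L_r(x) = (x - x_ref)^2   rendering loss, minimized at the input itself
#   L_w(x) = (x - x_wm)^2    watermark loss, minimized somewhere else


def grad_render(x: float, x_ref: float) -> float:
    """Gradient of the hypothetical rendering loss (x - x_ref)^2."""
    return 2.0 * (x - x_ref)


def grad_watermark(x: float, x_wm: float) -> float:
    """Gradient of the hypothetical watermark loss (x - x_wm)^2."""
    return 2.0 * (x - x_wm)


def total_grad(x: float, x_ref: float, x_wm: float, lam: float) -> float:
    """Gradient of the assumed total loss L_r + lam * L_w."""
    return grad_render(x, x_ref) + lam * grad_watermark(x, x_wm)


x_ref = 1.0   # the input already sits at the minimum of L_r
x_wm = 3.0    # hypothetical location favoured by the watermark loss
lam = 0.5     # hypothetical loss weight
delta = 0.2   # hypothetical perturbation applied to the input

# At the unperturbed input the rendering gradient vanishes, so the total
# gradient is exactly lam * grad_watermark: the watermark term dominates.
g_r_at_input = grad_render(x_ref, x_ref)
g_total_at_input = total_grad(x_ref, x_ref, x_wm, lam)

# After perturbing the input, the rendering loss contributes a nonzero
# gradient again, so both objectives take part in the update.
g_r_perturbed = grad_render(x_ref + delta, x_ref)
g_total_perturbed = total_grad(x_ref + delta, x_ref, x_wm, lam)

print("rendering gradient at input:      ", g_r_at_input)
print("total gradient at input:          ", g_total_at_input)
print("rendering gradient after perturb: ", g_r_perturbed)
print("total gradient after perturb:     ", g_total_perturbed)
```

In this toy setting the perturbation is a fixed scalar; the paper describes an adaptive mechanism whose details are not given in the captions above, so the example only shows why moving the input off the rendering-loss minimum lets both loss terms influence training.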