X-SG$^2$S: Safe and Generalizable Gaussian Splatting with X-dimensional Watermarks

Zihang Cheng, Huiping Zhuang, Chun Li, Xin Meng, Ming Li, Fei Richard Yu, Liqiang Nie

TL;DR

The paper tackles copyright protection for 3D Gaussian Splatting (3DGS) by introducing X-SG$^2$S, a framework that hides multimodal watermarks (1D, 2D, and 3D) inside a 3DGS scene while leaving the pretrained pipeline and the form of the 3DGS parameters unchanged. It proposes an injector-extractor architecture guided by a self-adaption gate and a learnable selection gate, with XD-Injection and XD-Extraction heads that distribute and recover watermark data across spherical harmonic (SH) representations. The approach combines patch-based data augmentation, a patching strategy for 1D/3D data, and a feature sparse DCAE for 2D data, and it remains compatible with pretrained pipelines. Reported results cover fidelity, robustness to point pruning, watermark identification with low false positives, and ablations of the design choices, positioning the method as a baseline for multimodal watermarking in 3DGS systems.

Abstract

3D Gaussian Splatting (3DGS) has been widely used in 3D reconstruction and 3D generation. Training a 3DGS scene often takes considerable time and resources, and sometimes valuable creative inspiration. The growing number of 3DGS digital assets poses great challenges for copyright protection, yet watermarking targeted at 3DGS remains underexplored. In this paper, we propose a new framework, X-SG$^2$S, which can simultaneously watermark 1D to 3D messages while keeping the original 3DGS scene almost unchanged. At a high level, X-SG$^2$S consists of an injector that adds multi-modal messages simultaneously and an extractor that recovers them. Specifically, we first split the watermarks into message patches in a fixed manner and sort the 3DGS points. A self-adaption gate then picks out suitable locations for watermarking, and XD-injection (multi-dimension) heads add the multi-modal messages to the sorted 3DGS points. A learnable gate recognizes the locations that carry extra messages, and XD-extraction heads restore the hidden messages from the locations recommended by the learnable gate. Extensive experiments demonstrate that X-SG$^2$S can effectively conceal multi-modal messages without changing the pretrained 3DGS pipeline or the original form of the 3DGS parameters. Moreover, with a simple, efficient, and practical model structure, X-SG$^2$S performs well in hiding and extracting multi-modal messages, whether internally structured or unstructured. X-SG$^2$S is the first model to unify 1D to 3D watermarking for 3DGS and the first framework to add multi-modal watermarks simultaneously to a single 3DGS scene, which paves the way for later research.
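The first two steps named in the abstract (fixed patching of the message and sorting of the 3DGS points) can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the paper's method: the function names `make_patches`, `sort_points`, and `recover` are hypothetical, simple repetition coding stands in for the paper's redundancy scheme, a lexicographic sort on coordinates stands in for its unspecified sorting method, and the learned gates and XD heads are omitted entirely.

```python
import numpy as np

def make_patches(bits, patch_len, repeats):
    """Repeat a 1D bit message, zero-pad it, and cut it into fixed-length patches.

    Assumption: plain repetition coding, not the paper's actual scheme.
    """
    redundant = np.tile(bits, repeats)
    pad = (-len(redundant)) % patch_len
    padded = np.concatenate([redundant, np.zeros(pad, dtype=bits.dtype)])
    return padded.reshape(-1, patch_len)

def sort_points(xyz):
    """Return indices ordering points by x, then y, then z.

    Assumption: a lexicographic order as a stand-in for the paper's sort.
    np.lexsort treats the LAST key as the primary key.
    """
    return np.lexsort((xyz[:, 2], xyz[:, 1], xyz[:, 0]))

def recover(patches, msg_len, repeats):
    """Undo the patching and take a majority vote over the repeated copies."""
    flat = patches.reshape(-1)[: msg_len * repeats]
    votes = flat.reshape(repeats, msg_len).sum(axis=0)
    return (votes * 2 > repeats).astype(np.uint8)

if __name__ == "__main__":
    message = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)
    patches = make_patches(message, patch_len=4, repeats=3)
    patches[0] ^= 1  # simulate losing one patch, e.g. through point pruning
    print(np.array_equal(recover(patches, len(message), 3), message))
```

With three copies of the message, corrupting a single patch still leaves two intact copies of every bit, so the majority vote restores the original message. A deterministic point order matters for the same reason: the extractor has to visit locations in the order the injector used.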

Paper Structure

This paper contains 32 sections, 11 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Application scenarios of the proposed X-SG$^2$S. Scenario 1: after a trainer has trained a 3DGS pipeline, an X-SG$^2$S injector is combined with it for joint use. When other users employ the trainer's model, they obtain a 3DGS result that carries watermarks. By providing the 3DGS file to the X-SG$^2$S extractor, the trainer can determine whether a user generated the result with the trainer's model. Scenario 2: a user who needs to add extra information to a 3DGS result for a particular purpose can use X-SG$^2$S directly. For example, the text "ACMMM", an image of the ICML logo, and a small 3DGS object can be added at the same time with the X-SG$^2$S injector and recovered with the X-SG$^2$S extractor.
  • Figure 2: Structure of the self-adaption gate.
  • Figure 3: Structure of the feature sparse DCAE.
  • Figure 4: How the raw data is made redundant and patched to make the watermarking more robust.
  • Figure 5: Structure and training process of X-SG$^2$S. Orange lines indicate gradient flow. Blue lines indicate truncated gradients. Purple lines indicate that the GS cloud is sorted by the sorting method. Black lines indicate the losses used for training.
  • ...and 4 more figures