X-SG$^2$S: Safe and Generalizable Gaussian Splatting with X-dimensional Watermarks
Zihang Cheng, Huiping Zhuang, Chun Li, Xin Meng, Ming Li, Fei Richard Yu, Liqiang Nie
TL;DR
The paper tackles the challenge of copyright protection for 3D Gaussian Splatting by introducing X-SG$^2$S, a framework that can hide multimodal watermarks (1D, 2D, and 3D) inside a 3DGS scene while leaving the scene almost unchanged and preserving the original form of its parameters. It proposes an injector-extractor architecture guided by a self-adaption gate and a learnable selection gate, employing XD-Injection and XD-Extraction heads to distribute and recover watermark data across SH representations. The approach uses patch-based data augmentation, a patching strategy for 1D/3D data, and a feature sparse DCAE for 2D data, achieving robust, high-fidelity watermarking that remains compatible with pretrained pipelines. Empirical results demonstrate strong fidelity, robustness to point pruning, precise watermark identification with low false positives, and ablation-supported design choices, establishing a practical baseline for multimodal watermarking in 3DGS systems.
Abstract
3D Gaussian Splatting (3DGS) has been widely used in 3D reconstruction and 3D generation. Training a 3DGS scene often takes considerable time and resources, and sometimes valuable creative inspiration. The increasing number of 3DGS digital assets has brought great challenges to copyright protection, yet watermarking targeted at 3DGS remains largely unexplored. In this paper, we propose a new framework, X-SG$^2$S, which can simultaneously watermark 1D to 3D messages while keeping the original 3DGS scene almost unchanged. At a high level, X-SG$^2$S consists of an injector that adds multi-modal messages simultaneously and an extractor that recovers them. Specifically, we first split the watermarks into message patches in a fixed manner and sort the 3DGS points. A self-adaption gate is used to pick out suitable locations for watermarking. XD (multi-dimensional) injection heads then add the multi-modal messages to the sorted 3DGS points. A learnable gate recognizes the locations that carry extra messages, and XD-extraction heads restore the hidden messages from the locations recommended by the learnable gate. Extensive experiments demonstrate that the proposed X-SG$^2$S can effectively conceal multi-modal messages without changing the pretrained 3DGS pipeline or the original form of the 3DGS parameters. Moreover, with a simple, efficient, and practical model structure, X-SG$^2$S performs well in hiding and extracting multi-modal messages, whether internally structured or unstructured. X-SG$^2$S is the first model to unify 1D to 3D watermarking for 3DGS and the first framework to add multi-modal watermarks simultaneously to a single 3DGS scene, paving the way for future research.
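The fixed patch splitting and point sorting mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the patch length, the lexicographic sort key, and the "k-th patch goes to the k-th sorted point" assignment are all assumptions made for illustration. In the paper, a self-adaption gate selects the watermark locations; the simple assignment below only stands in for that step.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def split_into_patches(bits: Sequence[int], patch_len: int) -> Tuple[List[List[int]], int]:
    """Split a 1D bit message into fixed-length patches (hypothetical helper).

    The last patch is zero-padded; the original length is returned so the
    padding can be removed again after extraction.
    """
    if patch_len <= 0:
        raise ValueError("patch_len must be positive")
    pad = (-len(bits)) % patch_len
    padded = list(bits) + [0] * pad
    patches = [padded[i:i + patch_len] for i in range(0, len(padded), patch_len)]
    return patches, len(bits)


def merge_patches(patches: Sequence[Sequence[int]], original_len: int) -> List[int]:
    """Concatenate patches in order and drop the zero padding."""
    flat = [b for patch in patches for b in patch]
    return flat[:original_len]


def sort_points(points: Sequence[Point]) -> List[int]:
    """Return point indices in a deterministic order.

    Assumption: lexicographic (x, y, z) order. The source only says the
    points are sorted, not which key is used.
    """
    return sorted(range(len(points)), key=lambda i: points[i])


def assign_patches(order: Sequence[int], patches: Sequence[List[int]]) -> Dict[int, List[int]]:
    """Map the k-th patch to the k-th sorted point (stand-in for the gate)."""
    if len(patches) > len(order):
        raise ValueError("not enough points to carry all patches")
    return {order[k]: patch for k, patch in enumerate(patches)}


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 1, 0]
    points = [(0.5, 0.1, 0.2), (0.1, 0.9, 0.3), (0.1, 0.2, 0.7), (0.9, 0.0, 0.0)]

    patches, n = split_into_patches(message, patch_len=3)
    order = sort_points(points)
    carriers = assign_patches(order, patches)

    # Extraction side: re-sort the points, read patches back in the same order.
    recovered = merge_patches([carriers[i] for i in order[:len(patches)]], n)
    print(recovered == message)
```

Because both the split and the ordering are deterministic, an extractor that repeats the same sort can reassemble the patches without any side information other than the message length, which is the property this sketch is meant to show.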
