GaussianStego: A Generalizable Stenography Pipeline for Generative 3D Gaussians Splatting

Chenxin Li, Hengyu Liu, Zhiwen Fan, Wuyang Li, Yifan Liu, Panwang Pan, Yixuan Yuan

TL;DR

GaussianStego tackles the challenge of embedding customizable, imperceptible, and recoverable information within generated 3D Gaussians used in Gaussian Splatting. It introduces a cross-attention-based embedding stage that injects hidden information into intermediate 3D generation features, followed by a U-Net decoder that recovers the hidden content from images rendered at a predetermined checking pose, while an adaptive gradient harmonization strategy balances hidden-information recovery against rendering quality. The approach supports 2D watermarks and multimodal content (text, QR codes, audio, video) with modality-specific decoding, and it demonstrates generalization to unseen objects as well as robustness to common image perturbations. Quantitative and qualitative results show competitive watermark recovery with rendering fidelity largely preserved, an initial step toward practical 3D content copyrighting and authentication in generative pipelines.
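The summary does not spell out how the adaptive gradient harmonization works, so the following is only a minimal sketch of one plausible reading: when the hidden-recovery gradient opposes the rendering gradient, its conflicting component is projected away before the two are summed. The function names `dot` and `harmonize`, the projection rule, and the plain-list gradient representation are all assumptions made for illustration, not the authors' implementation.

```python
def dot(u, v):
    """Inner product of two equally sized gradient vectors."""
    return sum(a * b for a, b in zip(u, v))


def harmonize(g_render, g_hidden):
    """Combine a rendering gradient and a hidden-recovery gradient.

    Assumption (not from the paper): if the two gradients conflict
    (negative inner product), the component of g_hidden that points
    against g_render is removed, so the combined update never opposes
    the rendering objective.
    """
    overlap = dot(g_hidden, g_render)
    if overlap < 0:
        # Project g_hidden onto the plane orthogonal to g_render.
        scale = overlap / dot(g_render, g_render)
        g_hidden = [h - scale * r for h, r in zip(g_hidden, g_render)]
    return [r + h for r, h in zip(g_render, g_hidden)]


# Conflicting case: the hidden gradient pushes against the rendering gradient.
combined = harmonize([1.0, 0.0], [-1.0, 1.0])
print(combined)
```

In this toy case the opposing component of the hidden-recovery gradient is dropped, so the combined update keeps a non-negative inner product with the rendering gradient. Gradients that already agree are simply added.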

Abstract

Recent advancements in large generative models and real-time neural rendering using point-based techniques pave the way for a future of widespread visual data distribution through sharing synthesized 3D assets. However, while standardized methods for embedding proprietary or copyright information, either overtly or subtly, exist for conventional visual content such as images and videos, this issue remains unexplored for emerging generative 3D formats like Gaussian Splatting. We present GaussianStego, a method for embedding steganographic information in the rendering of generated 3D assets. Our approach employs an optimization framework that enables the accurate extraction of hidden information from images rendered using Gaussian assets derived from large models, while maintaining their original visual quality. We conduct preliminary evaluations of our method across several potential deployment scenarios and discuss issues identified through analysis. GaussianStego represents an initial exploration into the novel challenge of embedding customizable, imperceptible, and recoverable information within the renders produced by current 3D generative models, while ensuring minimal impact on the rendered content's quality.

Paper Structure (14 sections, 4 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: GaussianStego training overview. During (a) Hidden Information Embedding, GaussianStego incorporates the DINOv2 features of the hidden information into the intermediate feature of Gaussian generation via cross-attention. In (b) Hidden Information Recovery, a U-Net-based decoder is employed to retrieve the hidden information from the rendered image under the checking pose. Through the optimization process, (c) Adaptive Gradient Harmonization is utilized to maintain a balance between the rendering and hidden recovery.
  • Figure 2: Qualitative comparison on the test objects of the Objaverse dataset. Within each column, we show the rendered image at the checking pose and the recovered hidden image.
  • Figure 3: Qualitative comparison on test images widely used by image-to-3D models. Within each column, we show the rendering and the recovered hidden image.
  • Figure 4: Quantitative results of GaussianStego with multimodal information being embedded.
  • Figure 5: Ablation study on the proposed key components.
  • ...and 1 more figures
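
The embedding step in Figure 1(a) can be illustrated with a small sketch of cross-attention in plain Python. It assumes that the intermediate generation features act as queries and the hidden-information features (DINOv2 features in the paper) act as keys and values. That role assignment, the residual addition, the omission of learned projection matrices, and the names `matmul`, `softmax`, and `cross_attention` are assumptions made for illustration only.

```python
import math


def matmul(a, b):
    """Multiply an (n x d) matrix by a (d x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention with a residual connection.

    queries: generation features, shape (n, d)   [assumed role]
    keys, values: hidden-information features, shape (m, d)   [assumed role]
    Learned projections are omitted to keep the sketch short.
    """
    d = len(queries[0])
    keys_t = [list(col) for col in zip(*keys)]
    scores = matmul(queries, keys_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    attended = matmul(weights, values)
    # Residual: inject the attended hidden information into the original features.
    return [[q + a for q, a in zip(q_row, a_row)]
            for q_row, a_row in zip(queries, attended)]


# Two generation tokens attend to a single hidden-information token.
fused = cross_attention([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]], [[2.0, 3.0]])
print(fused)
```

With a single hidden-information token, every query attends to it with weight 1, so the output is each generation feature plus that token's value vector. With several tokens, each query receives a softmax-weighted mixture instead.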