RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian Splatting

Longjie Zhao, Ziming Hong, Zhenyang Ren, Runnan Chen, Mingming Gong, Tongliang Liu

TL;DR

This work tackles the vulnerability of 3D Gaussian Splatting watermarks to diffusion-based editing by introducing RDSplat, a native 3D watermarking framework that targets low-frequency Gaussian components through covariance regularization and screen-space Mip filtering. It pairs multi-domain frequency control with an efficient surrogate training regime based on Gaussian blur to adversarially fine-tune watermark robustness across novel views, achieving strong invisibility and superior resilience to diffusion and classical distortions. Thorough evaluations on Blender, LLFF, and cross-study analyses demonstrate state-of-the-art performance and the practicality of cross-view watermark decoding for 3D assets. The proposed approach advances copyright protection for 3D assets by providing robust, view-consistent watermarking that survives semantic-level edits while preserving rendering quality.

Abstract

3D Gaussian Splatting (3DGS) has enabled the creation of digital assets and downstream applications, underscoring the need for robust copyright protection via digital watermarking. However, existing 3DGS watermarking methods remain highly vulnerable to diffusion-based editing, which can easily erase embedded provenance. This challenge highlights the urgent need for 3DGS watermarking techniques that are intrinsically resilient to diffusion-based editing. In this paper, we introduce RDSplat, a Robust watermarking paradigm against Diffusion editing for 3D Gaussian Splatting. RDSplat embeds watermarks into 3DGS components that diffusion-based editing inherently preserves, achieved through (i) proactively targeting low-frequency Gaussians and (ii) adversarial training with a diffusion proxy. Specifically, we introduce a multi-domain framework that operates natively in 3DGS space and embeds watermarks into diffusion-editing-preserved low-frequency Gaussians via coordinated covariance regularization and 2D filtering. In addition, we exploit the low-pass filtering behavior of diffusion-based editing by using Gaussian blur as an efficient training surrogate, enabling adversarial fine-tuning that further enhances watermark robustness against diffusion-based editing. Empirically, comprehensive quantitative and qualitative evaluations on three benchmark datasets demonstrate that RDSplat not only maintains superior robustness under diffusion-based editing, but also preserves watermark invisibility, achieving state-of-the-art performance.
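The abstract's use of Gaussian blur as a surrogate rests on blur being a low-pass filter. The following is a minimal NumPy sketch of that property only, not the authors' training code: the helper names (`gaussian_kernel_1d`, `gaussian_blur`, `band_energy`), the 3-sigma kernel radius, and the 0.1 cycles-per-pixel band cutoff are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel_1d(sigma, radius=None):
    """Normalized 1D Gaussian kernel (radius defaults to about 3 sigma; an assumption)."""
    if radius is None:
        radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2D grayscale image with reflect padding."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(image, r, mode="reflect")
    # Filter along rows, then along columns; "valid" removes the padding again.
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)


def band_energy(image, low_cut=0.1):
    """Spectral power below / above a radial cutoff (cycles per pixel; hypothetical value)."""
    spectrum = np.fft.fftshift(np.fft.fft2(image - image.mean()))
    h, w = image.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    power = np.abs(spectrum) ** 2
    return power[radius <= low_cut].sum(), power[radius > low_cut].sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.standard_normal((64, 64))  # white noise: energy in every band
    blurred = gaussian_blur(img, sigma=2.0)
    low0, high0 = band_energy(img)
    low1, high1 = band_energy(blurred)
    print("low-band energy retained: ", low1 / low0)
    print("high-band energy retained:", high1 / high0)
```

On a white-noise input, the high band should retain far less energy than the low band, which is why a watermark signal placed in low-frequency content is a plausible candidate for surviving blur-like edits.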

Paper Structure

This paper contains 33 sections, 15 equations, 10 figures, 11 tables.

Figures (10)

  • Figure 1: Overview of 3D Watermarking and Attack Mechanisms. Left: Original and watermarked 3DGS models with imperceptible differences; watermarks remain decodable from novel 2D views. Right: Classical attacks (blur, brightness, compression, noise) apply signal-level distortions while preserving watermark integrity. Diffusion-based editing (regeneration, global/local editing) performs semantic-level reconstruction that completely destroys watermarks yet produces visually plausible results that appear natural to humans, enabling covert copyright infringement. A robust 3DGS watermark needs to withstand both attack categories while maintaining invisibility.
  • Figure 2: Robustness comparison: classical vs. diffusion. Each method is shown as a diamond with four bars. The diamond area represents encoding capacity; its y-coordinate shows bit accuracy under classical attacks (averaged across multiple attack types); its x-coordinate shows TPR@1%FPR against diffusion editing. Bar length indicates TPR@1%FPR for each specific attack (longer is better). Due to diffusion-based attacks' destructive nature, evaluation shifts from bit accuracy to watermark detection (TPR@1%FPR). Our method achieves balanced robustness across all attacks on the Blender dataset.
  • Figure 3: Overview of the proposed RDSplat framework. The proposed RDSplat model embeds robust watermarks into 3D Gaussian Splatting (3DGS) representations to protect the copyright of 3D assets against diffusion-based semantic edits. The pipeline consists of three main components: (1) 3D Gaussian optimization, where the original 3DGS is densified and watermarked by modifying only the covariance matrices $\Sigma_i$ of the Gaussians; (2) 2D rendering and distortion simulation, which generates training images under diverse robustness distortions (e.g., hue jitter, Gaussian blur, brightness, JPEG compression); and (3) decoding and adversarial training, where a 2D decoder extracts the embedded watermark while losses such as $L_{\mathrm{MSE}}$, $L_{\mathrm{LPIPS}}$, and $L_{\mathrm{BCE}}$ jointly enforce fidelity, perceptual quality, and watermark robustness. This joint 3D and 2D optimization ensures that the watermark remains decodable from novel 2D views even after high-level semantic or diffusion-based modifications.
  • Figure 4: Sensitivity analysis of Gaussian blur sigma during training. Performance metrics across different blur sigma values ($\sigma$) on the LLFF flower dataset. The x-axis represents the Gaussian blur sigma size used during training. Left two plots show rendering quality metrics (PSNR, MSE), while right three plots demonstrate robustness against blur editing and diffusion editing.
  • Figure A.1: Frequency-domain analysis of diffusion-based attacks. The figure demonstrates frequency characteristics across eight different editing methods: Freq Rings (baseline frequency analysis), StoInv, DetInv, OmniGen, IP Adapter, Instruct Pix2Pix, DiffEdit, and CtrlNet Inpaint. Top row: Original spatial-domain images showing a garden table scene with varying editing effects (note OmniGen's winter transformation). Row 2 (Low): Low-frequency components visualized with black rings (smallest). Row 3 (Medium): Medium-frequency components visualized with yellow rings (medium-sized). Row 4 (High): High-frequency components visualized with red rings (largest). Row 5 (Combined): Combined visualization showing all three frequency bands together. A key observation is that most diffusion-based editing methods (particularly StoInv, DetInv, IP Adapter, and Instruct Pix2Pix) exhibit similar frequency patterns characterized by low-frequency preservation and high-frequency attenuation, comparable to classical blurring operations. This suggests that editing models inherently smooth mid- to high-frequency details to maintain semantic consistency while sacrificing pixel-level fidelity.
  • ...and 5 more figures
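
Figure 2 reports robustness against diffusion editing as TPR@1%FPR. As a rough illustration of how such a metric can be computed, the sketch below picks a detection threshold from scores of non-watermarked images so that at most 1% of them are flagged, then measures the fraction of watermarked images above that threshold. The function name `tpr_at_fpr` and the choice of score (for example, the number of matched bits) are assumptions for illustration; the paper's exact detection statistic is not given in this section.

```python
import numpy as np


def tpr_at_fpr(pos_scores, neg_scores, target_fpr=0.01):
    """True-positive rate at a threshold whose empirical false-positive rate
    does not exceed target_fpr.

    pos_scores: detection scores for watermarked images (higher = more likely marked).
    neg_scores: detection scores for non-watermarked images.
    A sample is flagged as watermarked when its score is strictly above the threshold.
    """
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]  # descending
    # Number of negatives that may be flagged without exceeding the target FPR.
    allowed = int(np.floor(target_fpr * neg.size))
    # Threshold is the (allowed + 1)-th largest negative score.
    threshold = neg[min(allowed, neg.size - 1)]
    fpr = float(np.mean(neg > threshold))
    tpr = float(np.mean(pos > threshold))
    return tpr, fpr, threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical scores: matched bits out of 32 for unmarked vs. marked images.
    neg = rng.binomial(32, 0.5, size=1000)
    pos = rng.binomial(32, 0.9, size=1000)
    tpr, fpr, thr = tpr_at_fpr(pos, neg)
    print(f"threshold={thr}, FPR={fpr:.3f}, TPR={tpr:.3f}")
```

Using a strict inequality keeps the empirical false-positive rate at or below the target even when several negative scores tie at the threshold.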