SLICE: Semantic Latent Injection via Compartmentalized Embedding for Image Watermarking

Zheng Gao; Yifan Yang; Xiaoyu Li; Xiaoyan Feng; Haoran Fan; Yang Song; Jiaojiao Jiang

Abstract

Watermarking the initial noise of diffusion models has emerged as a promising approach for image provenance, but content-independent noise patterns can be forged via inversion and regeneration attacks. Recent semantic-aware watermarking methods improve robustness by conditioning verification on image semantics. However, their reliance on a single global semantic binding makes them vulnerable to localized but globally coherent semantic edits. To address this limitation and provide a trustworthy semantic-aware watermark, we propose $\underline{\textbf{S}}$emantic $\underline{\textbf{L}}$atent $\underline{\textbf{I}}$njection via $\underline{\textbf{C}}$ompartmentalized $\underline{\textbf{E}}$mbedding ($\textbf{SLICE}$). Our framework decouples image semantics into four semantic factors (subject, environment, action, and detail) and precisely anchors them to distinct regions in the initial Gaussian noise. This fine-grained semantic binding enables advanced watermark verification where semantic tampering is detectable and localizable. We theoretically justify why SLICE enables robust and reliable tamper localization and provides statistical guarantees on false-accept rates. Experimental results demonstrate that SLICE significantly outperforms existing baselines against advanced semantic-guided regeneration attacks, substantially reducing attack success while preserving image quality and semantic fidelity. Overall, SLICE offers a practical, training-free provenance solution that is both fine-grained in diagnosis and robust to realistic adversarial manipulations.
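The compartmentalized binding described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a quadrant layout (one quadrant per semantic factor), a SHA-256-derived seed per factor, and hypothetical helper names `factor_seed` and `build_noise`. None of these details come from the source; they only show how a change to one factor can be confined to one region of the initial noise.

```python
import hashlib
import random

# The four semantic factors named in the abstract.
FACTORS = ("subject", "environment", "action", "detail")


def factor_seed(key: bytes, factor: str, descriptor: str) -> int:
    """Derive a deterministic seed from a secret key and one factor's descriptor.

    Assumption: SHA-256 over key | factor | descriptor. The source does not
    specify how descriptors are mapped to noise.
    """
    material = key + b"|" + factor.encode() + b"|" + descriptor.encode()
    return int.from_bytes(hashlib.sha256(material).digest()[:8], "big")


def build_noise(key: bytes, descriptors: dict, height: int = 8, width: int = 8):
    """Fill each quadrant of a height x width grid with factor-seeded Gaussian noise.

    Assumption: a fixed quadrant layout stands in for the "distinct regions"
    mentioned in the abstract.
    """
    half_h, half_w = height // 2, width // 2
    origins = {
        "subject": (0, 0),
        "environment": (0, half_w),
        "action": (half_h, 0),
        "detail": (half_h, half_w),
    }
    noise = [[0.0] * width for _ in range(height)]
    for factor in FACTORS:
        rng = random.Random(factor_seed(key, factor, descriptors[factor]))
        row0, col0 = origins[factor]
        for r in range(row0, row0 + half_h):
            for c in range(col0, col0 + half_w):
                noise[r][c] = rng.gauss(0.0, 1.0)
    return noise
```

Under these assumptions, editing only the `action` descriptor changes only the lower-left quadrant, which is the property that makes a per-region check able to localize a semantic edit.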

Paper Structure (21 sections, 5 theorems, 28 equations, 5 figures, 4 tables)

Introduction
Roadmap.
Related Work
Watermarking for Generative Models
Diffusion Models and Latent Inversion
Semantic Control and Disentangled Representations
Method
Prompt-Conditioned Factorized Semantic Extraction
Spatially-Partitioned Semantic Injection
Watermark Detection
Theoretical Analysis
Prompt Language Selection
Experiment
Resistance Against Generative Forgery Attacks
Robustness Against Common Image Corruptions
...and 6 more sections

Key Result

Theorem 4.3

Let $\mathcal{J} \subseteq \mathcal{K}$ be the set of tampered semantic factors. Suppose that Assumptions as:bound_err and as:sem_pertb hold. If the set of local thresholds $\{\tau_k\}_{k\in\mathcal{K}}$ satisfies $\tau_k \geq \epsilon_k + \delta_k$ for all $k \in \mathcal{K} \setminus \mathcal{J}$ and [...] We write $a_+ = \max\{a, 0\}$ for any $a \in \mathbb{R}$.
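The threshold condition in the theorem can be read as a per-factor decision rule, sketched below. This is an illustration under stated assumptions, not the paper's detector. It assumes that each factor $k$ yields a scalar distance between the recovered and expected watermark, and that $\epsilon_k$ and $\delta_k$ are upper bounds on the two error sources named by the assumptions (the source does not define them in this excerpt). The function names `local_thresholds` and `localize_tampering` are hypothetical.

```python
def local_thresholds(eps: dict, delta: dict) -> dict:
    """Smallest thresholds allowed by the stated condition: tau_k = eps_k + delta_k."""
    return {k: eps[k] + delta[k] for k in eps}


def localize_tampering(distances: dict, thresholds: dict) -> list:
    """Return the factors whose distance exceeds the local threshold.

    Assumption: a factor is flagged as tampered when distance_k > tau_k.
    """
    return sorted(k for k in distances if distances[k] > thresholds[k])
```

With $\tau_k \geq \epsilon_k + \delta_k$, an untampered factor whose distance stays within its assumed error budget is never flagged, so only factors in $\mathcal{J}$ can be reported.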

Figures (5)

Figure 1: The overall framework of SLICE.
Figure 2: Structure of the Meta-Prompt $\mathcal{P}_{\mathrm{meta}}$.
Figure 3: Semantic extraction stability across prompt languages. The axes represent text embedding cosine similarity between initial and re-extracted descriptors.
Figure 4: Qualitative comparison of visual fidelity with and without SLICE watermarking.
Figure 5: Case study of the proposed multi-granularity verification mechanism.

Theorems & Definitions (7)

Theorem 4.3: Robust localization under partial corruption
Theorem 4.4: Exponential false-accept bound for keyless or unwatermarked inputs
Theorem A.1: Restatement of Theorem 4.3
proof
Lemma B.1: Chernoff bounds, Theorem 2.17 in zhang2023mathematical
Theorem B.2: Restatement of Theorem 4.4
proof
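To give a feel for the kind of exponential false-accept bound that Theorem 4.4 and the Chernoff lemma refer to, the sketch below compares an exact binomial tail with the Hoeffding form of the Chernoff bound. It assumes a simplified verification statistic: an unwatermarked input yields $n$ independent fair bits, and it is accepted when at least $m$ of them match the key. That model and the function names are assumptions for illustration. The paper's actual statistic and constants are not shown in this excerpt.

```python
import math


def exact_false_accept(n: int, m: int) -> float:
    """P[Binomial(n, 1/2) >= m]: the chance that random bits pass the match test."""
    return sum(math.comb(n, i) for i in range(m, n + 1)) / 2 ** n


def hoeffding_bound(n: int, m: int) -> float:
    """Hoeffding upper bound exp(-2 n t^2) with t = m/n - 1/2, valid for m >= n/2."""
    if 2 * m < n:
        raise ValueError("bound is only meaningful for m >= n/2")
    t = m / n - 0.5
    return math.exp(-2.0 * n * t * t)
```

For a fixed match ratio $m/n$, the bound decays exponentially in $n$, so under this simplified model a longer watermark makes accidental acceptance rapidly less likely.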
