
Z-Erase: Enabling Concept Erasure in Single-Stream Diffusion Transformers

Nanxiang Jiang, Zhaoxin Fan, Baisen Wang, Daiheng Gao, Junhang Cheng, Jifeng Guo, Yalan Qin, Yeying Jin, Hongwei Zheng, Faguo Wu, Wenjun Wu

Abstract

Concept erasure serves as a vital safety mechanism for removing unwanted concepts from text-to-image (T2I) models. While extensively studied in U-Net and dual-stream architectures (e.g., Flux), the task remains under-explored in the recently emerged paradigm of single-stream diffusion transformers (e.g., Z-Image). In this paradigm, text and image tokens are processed as a single unified sequence through shared parameters, so directly applying prior erasure methods typically leads to generation collapse. To bridge this gap, we introduce Z-Erase, the first concept erasure method tailored for single-stream T2I models. To guarantee stable image generation, Z-Erase first employs a Stream Disentangled Concept Erasure Framework that decouples updates across the two streams, enabling existing erasure methods on single-stream models. Within this framework, we then introduce Lagrangian-Guided Adaptive Erasure Modulation, a constrained optimization algorithm that balances the delicate erasure-preservation trade-off. Moreover, we provide a rigorous convergence analysis proving that Z-Erase converges to a Pareto stationary point. Experiments demonstrate that Z-Erase overcomes the generation collapse issue and achieves state-of-the-art performance across a wide range of tasks.
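To make the single-stream setting concrete, here is a minimal sketch of how text and image tokens travel through one shared block as a unified sequence. This is an illustrative PyTorch-style assumption, not Z-Image's actual code; module names and shapes are invented for exposition. Because the same projections serve both modalities, naively fine-tuning them for erasure perturbs image generation as well:

```python
# Minimal sketch of a single-stream transformer block (illustrative only).
import torch
import torch.nn as nn

class SingleStreamBlock(nn.Module):
    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        # One shared attention module serves BOTH modalities -- the coupling
        # that makes naive fine-tuning destabilize image generation.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, text_tokens: torch.Tensor, image_tokens: torch.Tensor):
        # Concatenate both modalities into a single unified sequence.
        x = torch.cat([text_tokens, image_tokens], dim=1)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)
        x = x + attn_out
        x = x + self.mlp(self.norm2(x))
        # Split back into the two streams for downstream use.
        n_text = text_tokens.shape[1]
        return x[:, :n_text], x[:, n_text:]
```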

Paper Structure

This paper contains 34 sections, 4 theorems, 55 equations, 17 figures, 10 tables, and 1 algorithm.

Key Result

Lemma 3.3

Under the paper's Lipschitz and smoothness assumptions, the difference between the true gradient $g_t$ and the approximate gradient $\tilde{g}_t$ used in Algorithm 1 is bounded in terms of the step size $\alpha$.
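The bound itself is not reproduced in this summary. For intuition, given a gradient-norm bound $G$ (from Lipschitz continuity) and a smoothness constant $L$, a one-step approximation error of this kind typically takes the following form; this is an illustrative reconstruction with $L$ and $G$ assumed, not the paper's exact statement:

```latex
% Illustrative form only: L (smoothness constant) and G (gradient-norm bound)
% come from the named assumptions; this is not the paper's exact bound.
\[
  \bigl\lVert g_t - \tilde{g}_t \bigr\rVert \;\le\; L\, G\, \alpha ,
\]
% so the approximation error vanishes linearly as the step size \(\alpha \to 0\).
```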

Figures (17)

  • Figure 1: We introduce Z-Erase, a concept erasure method tailored for single-stream models. First row: Prior methods adapted via our Stream Disentangled Concept Erasure Framework are tested with the "nudity" concept, showing under-erasure (AC, EraseAnything) or severe artifacts (UCE). Blue boxes reveal the generation collapse caused by naive fine-tuning without our adaptation. Second row: Z-Erase effectively removes target concepts while preserving quality. Original outputs are in yellow boxes. Sensitive content pixelated.
  • Figure 2: Single-stream attention analysis. Given the prompt "A girl with a wristwatch on her hand amid cherry blossoms", the attention maps reveal distinct localized responses for text tokens.
  • Figure 3: Left: Attention erasure is brittle. A concept can be erased by zeroing its attention columns ($\mathbf{A}_{attn}[:, :, idx] = 0$) once the target token is localized from the prompt, but this trick is not robust to real-world prompt variations (a minimal sketch of the zeroing probe appears after this list). Right: Naive fine-tuning collapses. Directly fine-tuning the shared projections in single-stream models collapses generation into noisy outputs.
  • Figure 4: Dynamic balancing of erasure and preservation gradients (see the steering sketch after this list). Left: When $\nabla \mathcal{L}_{er}$ does not conflict with $\nabla \mathcal{L}_{pr}$, the erasure gradient itself is the update direction. Right: When the gradients conflict, the update direction $\mathbf{d}^*_t$ is obtained by steering $\nabla \mathcal{L}_{er}$ toward $\nabla \mathcal{L}_{pr}$ until the preservation constraint $\varepsilon$ is satisfied.
  • Figure 5: Single-concept erasure. Our method removes both concrete and abstract concepts while preserving image quality with minimal collateral changes. In contrast, UCE severely distorts images and introduces strong artifacts, whereas EraseAnything may fail to erase or may cause noticeable semantic shifts.
  • ...and 12 more figures
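As referenced in the Figure 3 caption, the left panel's probe zeroes the attention columns of a localized target token. Below is a minimal sketch of that operation, assuming standard scaled-dot-product attention and that `idx` holds the target concept's text-token positions (both illustrative assumptions, not Z-Image internals):

```python
# Sketch of the attention-column zeroing probe from Figure 3 (left).
import torch

def attention_with_concept_zeroed(q, k, v, idx):
    # q, k, v: (batch, heads, seq_len, head_dim); idx: target token positions.
    scale = q.shape[-1] ** -0.5
    attn = torch.softmax((q @ k.transpose(-2, -1)) * scale, dim=-1)
    # A_attn[..., idx] = 0: block all queries from attending to the target
    # concept's tokens. Brittle, since idx must be re-localized per prompt.
    attn[..., idx] = 0.0
    return attn @ v
```

Because `idx` must be found anew for every prompt phrasing, the figure argues this trick fails under real-world prompt variations.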
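The steering rule described in the Figure 4 caption can likewise be made concrete with a hedged sketch in the spirit of constrained gradient projection; the closed form and tolerance handling below are our assumptions of one reasonable instantiation, not the paper's exact algorithm:

```python
# Sketch of the conflict-aware update from Figure 4 (illustrative only).
import torch

def steered_update(g_er, g_pr, eps=0.0):
    """Steer the erasure gradient to respect a preservation constraint.

    g_er: flattened gradient of the erasure loss.
    g_pr: flattened gradient of the preservation loss.
    eps:  tolerance in the constraint <d, g_pr> >= -eps.
    """
    inner = torch.dot(g_er, g_pr)
    if inner >= -eps:
        # Left panel: no conflict, the erasure gradient is the update.
        return g_er
    # Right panel: steer g_er toward g_pr just enough that the update
    # lands on the constraint boundary <d, g_pr> = -eps.
    coeff = (inner + eps) / torch.dot(g_pr, g_pr).clamp_min(1e-12)
    return g_er - coeff * g_pr
```

With no conflict the erasure gradient passes through unchanged (left panel); otherwise the returned direction satisfies the preservation constraint with equality (right panel).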

Theorems & Definitions (11)

  • proof
  • proof
  • proof
  • Lemma 3.3: Approximation Error Bound
  • proof
  • Lemma 3.4: Total Functional Variation
  • proof
  • Theorem 3.5: Dynamic Regret Bound
  • proof
  • Theorem 3.6: Pareto Stationarity
  • ...and 1 more