Unsupervised Region-Based Image Editing of Denoising Diffusion Models

Zixiang Li; Yue Song; Renshuai Tao; Xiaohong Jia; Yao Zhao; Wei Wang

TL;DR

This work addresses the discovery and control of semantic attributes directly within the latent space of pre-trained diffusion models, without supervision. It introduces Region-Based Editing (RBE), which uses the Jacobian of the denoising network with respect to region-specific latent vectors and applies an orthogonal projection to confine edits to a target region using a coarse mask. By combining power iteration to approximate the dominant Jacobian directions with masked Jacobian refinements, RBE enables precise local attribute editing while preserving global image structure. It achieves state-of-the-art results on multiple datasets and, for some face attributes, surpasses supervised methods. Because it requires no annotations or further training, the approach applies across diffusion model architectures.
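To make the power-iteration step concrete, here is a minimal sketch of how a dominant Jacobian direction can be approximated. It is an illustration under stated assumptions, not the authors' implementation: the Jacobian `J` is a small explicit toy matrix (a real denoising network would instead be queried through automatic-differentiation Jacobian-vector products), and the function name `top_right_singular_vector` is hypothetical.

```python
import math
import random


def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def transpose(A):
    """Return the transpose of A as a list of rows."""
    return [list(col) for col in zip(*A)]


def normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def top_right_singular_vector(J, iters=200, seed=0):
    """Power iteration on J^T J (hypothetical helper).

    Returns the latent direction v that changes the output of the
    linearized map most strongly, plus its singular value.
    """
    rng = random.Random(seed)
    v = normalize([rng.gauss(0.0, 1.0) for _ in range(len(J[0]))])
    Jt = transpose(J)
    for _ in range(iters):
        # One step: v <- normalize(J^T (J v))
        v = normalize(matvec(Jt, matvec(J, v)))
    sigma = math.sqrt(sum(x * x for x in matvec(J, v)))
    return v, sigma


# Toy Jacobian (assumption): 2 output pixels, 3 latent dimensions.
J = [[3.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
v, sigma = top_right_singular_vector(J)
print(v, sigma)
```

In this toy case the iteration settles on the first latent axis (up to sign), since that axis has the largest effect on the output. Only products with `J` and its transpose are needed, which is why the same scheme can be run without ever forming a full Jacobian.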

Abstract

Although diffusion models have achieved remarkable success in the field of image generation, their latent space remains under-explored. Current methods for identifying semantics within latent space often rely on external supervision, such as textual information and segmentation masks. In this paper, we propose a method to identify semantic attributes in the latent space of pre-trained diffusion models without any further training. By projecting the Jacobian of the targeted semantic region into a low-dimensional subspace which is orthogonal to the non-masked regions, our approach facilitates precise semantic discovery and control over local masked areas, eliminating the need for annotations. We conducted extensive experiments across multiple datasets and various architectures of diffusion models, achieving state-of-the-art performance. In particular, for some specific face attributes, the performance of our proposed method even surpasses that of supervised approaches, demonstrating its superior ability in editing local image properties.
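The orthogonal projection described in the abstract can be illustrated with a small sketch. It assumes a toy setting that is not taken from the paper: `J_outside` is a hypothetical explicit Jacobian of the non-masked pixels with respect to the latent, and `v` is a candidate editing direction found from the masked region. Removing from `v` every component that lies in the row space of `J_outside` leaves a direction that, to first order, does not change the non-masked pixels.

```python
import math


def dot(a, b):
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out(v, basis):
    """Remove from v its components along each orthonormal basis vector."""
    w = list(v)
    for q in basis:
        c = dot(w, q)
        w = [wi - c * qi for wi, qi in zip(w, q)]
    return w


def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = project_out(v, basis)
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


# Toy setup (assumption): 3 latent dimensions, 1 non-masked pixel.
J_outside = [[1.0, 1.0, 0.0]]   # how the non-masked pixel reacts to the latent
v = [1.0, 0.0, 1.0]             # candidate direction from the masked region

basis = orthonormal_basis(J_outside)   # row space of J_outside
v_local = project_out(v, basis)        # direction confined to the mask

# First-order change of the non-masked pixel along the projected direction.
leak = [dot(row, v_local) for row in J_outside]
print(v_local, leak)
```

Here `v_local` comes out as approximately `[0.5, -0.5, 1.0]` and `leak` is zero up to floating-point error, so the edit moves only what the masked region's Jacobian can see. A practical version would use a low-rank approximation of the non-masked Jacobian rather than its full row space.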
