BlobCtrl: Taming Controllable Blob for Element-level Image Editing

Yaowei Li; Lingen Li; Zhaoyang Zhang; Xiaoyu Li; Guangzhi Wang; Hongxiang Li; Xiaodong Cun; Ying Shan; Yuexian Zou

TL;DR

This work presents BlobCtrl, a framework for element-level image editing based on a probabilistic blob-based representation that disentangles layout from appearance, affording fine-grained, controllable object-level manipulation.

Abstract

As user expectations for image editing continue to rise, the demand for flexible, fine-grained manipulation of specific visual elements presents a challenge for current diffusion-based methods. In this work, we present BlobCtrl, a framework for element-level image editing based on a probabilistic blob-based representation. Treating blobs as visual primitives, BlobCtrl disentangles layout from appearance, affording fine-grained, controllable object-level manipulation. Our key contributions are twofold: (1) an in-context dual-branch diffusion model that separates foreground and background processing, incorporating blob representations to explicitly decouple layout and appearance, and (2) a self-supervised disentangle-then-reconstruct training paradigm with an identity-preserving loss function, along with tailored strategies to efficiently leverage blob-image pairs. To foster further research, we introduce BlobData for large-scale training and BlobBench, a benchmark for systematic evaluation. Experimental results demonstrate that BlobCtrl achieves state-of-the-art performance in a variety of element-level editing tasks, such as object addition, removal, scaling, and replacement, while maintaining computational efficiency. Project Webpage: https://liyaowei-stu.github.io/project/BlobCtrl/
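To make the layout/appearance split concrete, here is a minimal sketch of what a blob used as a layout primitive could look like. The abstract does not give the parameterization, so everything here is an assumption made for illustration: the blob is modeled as a rotated 2D Gaussian, and the names `Blob`, `opacity`, `rasterize`, `move`, and `scale` are hypothetical, not the authors' API. The point is only that edits such as moving or scaling touch layout parameters alone, leaving any appearance information untouched.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Blob:
    """Hypothetical layout-only blob: a rotated 2D Gaussian (assumed form)."""
    cx: float     # center x, in pixels
    cy: float     # center y, in pixels
    a: float      # standard deviation along the major axis
    b: float      # standard deviation along the minor axis
    theta: float  # rotation of the major axis, in radians


def opacity(blob: Blob, x: float, y: float) -> float:
    """Unnormalized Gaussian value in (0, 1] at pixel (x, y)."""
    dx, dy = x - blob.cx, y - blob.cy
    c, s = math.cos(blob.theta), math.sin(blob.theta)
    # Rotate the offset into the blob's own axes.
    u = c * dx + s * dy
    v = -s * dx + c * dy
    return math.exp(-0.5 * ((u / blob.a) ** 2 + (v / blob.b) ** 2))


def rasterize(blob: Blob, width: int, height: int) -> list[list[float]]:
    """Soft opacity map indexed as grid[y][x]."""
    return [[opacity(blob, x, y) for x in range(width)] for y in range(height)]


def move(blob: Blob, dx: float, dy: float) -> Blob:
    """Translate the blob; only layout parameters change."""
    return replace(blob, cx=blob.cx + dx, cy=blob.cy + dy)


def scale(blob: Blob, factor: float) -> Blob:
    """Resize the blob about its center; orientation is kept."""
    return replace(blob, a=blob.a * factor, b=blob.b * factor)
```

For example, `move(Blob(4, 4, 2, 1, 0.0), 2, 0)` yields a blob whose opacity peaks at pixel (6, 4) instead of (4, 4). In a full system, a map like the one returned by `rasterize` would be one conditioning input to the diffusion model; how BlobCtrl actually injects it is not described in this section.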
