DreamEdit: Subject-driven Image Editing

Tianle Li, Max Ku, Cong Wei, Wenhu Chen

TL;DR

The paper tackles controllable subject-driven image editing by introducing two tasks, Subject Replacement and Subject Addition, and presenting DreamEditBench as a standardized benchmark. It proposes DreamEditor, an iterative method that fine-tunes a diffusion model on a target subject, uses segmentation-guided inpainting, and refines the result over multiple iterations to balance subject fidelity with background realism. In human evaluations against baselines, the approach performs better on both proposed tasks, and the comparison also exposes gaps in automatic metrics. The work aims to establish DreamEditBench as a standard platform for future work on controllable subject-driven editing.

Abstract

Subject-driven image generation aims at generating images containing customized subjects, which has recently drawn enormous attention from the research community. However, the previous works cannot precisely control the background and position of the target subject. In this work, we aspire to fill the void and propose two novel subject-driven sub-tasks, i.e., Subject Replacement and Subject Addition. The new tasks are challenging in multiple aspects: replacing a subject with a customized one can change its shape, texture, and color, while adding a target subject to a designated position in a provided scene necessitates a context-aware posture. To conquer these two novel tasks, we first manually curate a new dataset DreamEditBench containing 22 different types of subjects, and 440 source images with different difficulty levels. We plan to host DreamEditBench as a platform and hire trained evaluators for standard human evaluation. We also devise an innovative method DreamEditor to resolve these tasks by performing iterative generation, which enables a smooth adaptation to the customized subject. In this project, we conduct automatic and human evaluations to understand the performance of DreamEditor and baselines on DreamEditBench. For Subject Replacement, we found that the existing models are sensitive to the shape and color of the original subject. The model failure rate will dramatically increase when the source and target subjects are highly different. For Subject Addition, we found that the existing models cannot easily blend the customized subjects into the background smoothly, leading to noticeable artifacts in the generated image. We hope DreamEditBench can become a standard platform to enable future investigations toward building more controllable subject-driven image editing. Our project homepage is https://dreameditbenchteam.github.io/.

Paper Structure

This paper contains 19 sections, 3 equations, 7 figures, 4 tables, and 1 algorithm.

Figures (7)

  • Figure 1: The leftmost column shows the customized subject, the middle column shows the Subject Replacement task, and the rightmost column shows the Subject Addition task. The outputs are generated by DreamEditor.
  • Figure 2: Visualization of DreamEditor iteratively refining the generated target subject.
  • Figure 3: DreamEditor generates its results iteratively: the output of each iteration serves as the input to the next. In each iteration, it uses the dilated subject-segmentation mask and a specialized prompt to guide DDIM inversion and gradually inpaint a subject that is more similar to the target.
  • Figure 4: Results on Subject Replacement Task Compared with Baselines
  • Figure 5: Results on Subject Addition Task Compared with Baselines
  • ...and 2 more figures
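
The iterative loop described in the Figure 3 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `dilate_mask`, `iterative_edit`, and the `segment` and `inpaint` callables are hypothetical names, the `inpaint` callable merely stands in for the prompt-guided DDIM inversion and inpainting step, and the toy stand-ins at the bottom exist only so the loop can be run without a diffusion model.

```python
import numpy as np


def dilate_mask(mask: np.ndarray, radius: int = 1) -> np.ndarray:
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element."""
    mask = mask.astype(bool)
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    # OR together every shifted copy of the mask inside the square window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def iterative_edit(image, segment, inpaint, prompt, num_iters=3, radius=1):
    """Feed each iteration's output back in as the next iteration's input.

    segment(image) -> boolean subject mask (hypothetical callable)
    inpaint(image, mask, prompt, step) -> edited image (hypothetical callable,
        standing in for prompt-guided DDIM inversion plus inpainting)
    """
    current = image
    for step in range(num_iters):
        # Segment the current subject and grow the mask slightly so the
        # edited region can change shape.
        mask = dilate_mask(segment(current), radius)
        current = inpaint(current, mask, prompt, step)
    return current


# Toy stand-ins (assumptions for illustration only, no real model involved).
def toy_segment(img):
    return img > 0.0


def toy_inpaint(img, mask, prompt, step):
    out = img.copy()
    out[mask] = 0.5 * out[mask] + 0.5  # move masked pixels halfway toward 1.0
    return out


source = np.zeros((5, 5))
source[2, 2] = 0.2
result = iterative_edit(source, toy_segment, toy_inpaint,
                        "placeholder prompt", num_iters=2)
```

In a real pipeline the two callables would be replaced by a segmentation model and a subject-fine-tuned diffusion inpainting model; the number of iterations and the dilation radius used here are arbitrary placeholders rather than values from the paper.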