Edit One for All: Interactive Batch Image Editing
Thao Nguyen, Utkarsh Ojha, Yuheng Li, Haotian Liu, Yong Jae Lee
TL;DR
Interactive batch image editing transfers a user-specified edit from one example image to a set of unseen images by operating in the StyleGAN2 latent space. The method learns a globally consistent latent direction $\Delta^{*}_w$ and a per-image strength $\alpha_i$ so that all edited outputs converge to the same final state, enabling fast, consistent batch edits. It achieves visual quality on par with single-image editing baselines while substantially reducing manual annotation and time, and it generalizes across domains such as faces, animals, and bodies. The approach frames the problem geometrically in latent space via a semantic hyperplane, and the authors indicate a potential extension to diffusion models as future work.
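The roles of the shared direction $\Delta^{*}_w$ and the per-image strength $\alpha_i$ can be illustrated with a small sketch. It is not the authors' implementation: it assumes the edited attribute is scored linearly in latent space by a hyperplane with a hypothetical `normal` and `bias`, and it picks each $\alpha_i$ in closed form so that every edited latent reaches the same score as the edited example. All function and parameter names are illustrative, and plain Python lists stand in for StyleGAN2 latent codes to keep the sketch dependency-free.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def signed_score(w: Vector, normal: Vector, bias: float) -> float:
    """Unnormalized signed position of latent w relative to the hyperplane.

    Assumption: the attribute state is modeled as normal . w + bias.
    """
    return dot(w, normal) + bias


def edit_strength(w: Vector, direction: Vector, normal: Vector,
                  bias: float, target_score: float) -> float:
    """Strength alpha such that w + alpha * direction reaches target_score."""
    gain = dot(direction, normal)  # score change per unit of alpha
    if abs(gain) < 1e-12:
        raise ValueError("direction is parallel to the hyperplane")
    return (target_score - signed_score(w, normal, bias)) / gain


def apply_edit(w: Vector, direction: Vector, alpha: float) -> List[float]:
    """Move latent w along the shared direction by strength alpha."""
    return [wi + alpha * di for wi, di in zip(w, direction)]


def batch_edit(latents: Sequence[Vector], direction: Vector, normal: Vector,
               bias: float, target_score: float) -> List[List[float]]:
    """Edit every latent with the same direction but its own strength."""
    edited = []
    for w in latents:
        alpha = edit_strength(w, direction, normal, bias, target_score)
        edited.append(apply_edit(w, direction, alpha))
    return edited
```

In this toy setting, latents that start far from the target state receive a larger $\alpha_i$ and latents already close to it receive a smaller one, which is the sense in which one direction plus adaptive strengths brings all images to the same final state. In the actual method the direction is learned from the user's edit and the edited latents are decoded by the generator, neither of which is modeled here.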
Abstract
In recent years, image editing has advanced remarkably. With increased human control, it is now possible to edit an image in a plethora of ways, from specifying in text what we want to change to directly dragging the contents of the image in an interactive, point-based manner. However, most of the focus has remained on editing a single image at a time. Whether and how we can simultaneously edit large batches of images has remained understudied. With the goal of minimizing human supervision in the editing process, this paper presents a novel method for interactive batch image editing using StyleGAN as the medium. Given an edit specified by users in an example image (e.g., make the face frontal), our method can automatically transfer that edit to other test images, so that regardless of their initial state (e.g., pose), they all arrive at the same final state (e.g., all facing front). Extensive experiments demonstrate that edits performed using our method have similar visual quality to existing single-image-editing methods, while having more visual consistency and saving significant time and human effort.
