ParallelEdits: Efficient Multi-object Image Editing
Mingzhen Huang, Jialing Cai, Shan Jia, Vishnu Suresh Lokhande, Siwei Lyu
TL;DR
ParallelEdits tackles multi-aspect text-driven image editing by integrating edits across multiple attributes into diffusion steps through a fixed multi-branch architecture guided by an attention-aggregation mechanism. It introduces aspect grouping to partition edits into $N$ branches and performs inversion-free, branch-calibrated updates with cross-branch interactions to preserve content while editing multiple attributes simultaneously. The PIE-Bench++ dataset is proposed to benchmark multi-aspect editing, and experiments show superior editing accuracy and content preservation compared with state-of-the-art baselines, at a manageable computational cost. Overall, the work advances efficient, scalable multi-attribute editing in diffusion models and provides a robust benchmark for evaluating such methods, while noting remaining limitations and potential safeguards for deployment.
Abstract
Text-driven image synthesis has made significant advancements with the development of diffusion models, transforming how visual content is generated from text prompts. Despite these advances, text-driven image editing, a key area in computer graphics, faces unique challenges. A major challenge is making simultaneous edits across multiple objects or attributes. Applying existing editing methods sequentially, one attribute at a time, increases computational cost and reduces efficiency. In this paper, we address these challenges. Our main contribution is ParallelEdits, a method that seamlessly manages simultaneous edits across multiple attributes. In contrast to previous approaches, ParallelEdits not only preserves the quality of single-attribute edits but also significantly improves the performance of multitasking edits. This is achieved through an innovative attention distribution mechanism and a multi-branch design that operates across several processing heads. Additionally, we introduce the PIE-Bench++ dataset, an expansion of the original PIE-Bench dataset, to better support the evaluation of image-editing tasks involving multiple objects and attributes simultaneously. This dataset serves as a benchmark for evaluating text-driven image editing methods in multifaceted scenarios.
