PromptArtisan: Multi-instruction Image Editing in Single Pass with Complete Attention Control
Kunal Swami, Raghu Chittersu, Pranav Adlinge, Rajeev Irny, Shashavali Doodekula, Alok Shukla
TL;DR
PromptArtisan is a zero-shot, single-pass framework for multi-instruction image editing that handles multiple mask-prompt pairs through a Complete Attention Control Mechanism (CACM). By computing independent prompt embeddings and controlling both cross- and self-attention, it produces precise, mask-localized edits even where regions overlap, without additional training or test-time optimization. The approach is compared against state-of-the-art IBE methods on the MiE-Bench dataset, where it shows superior qualitative and quantitative performance, and is supported by extensive ablations and additional results. This makes image editing workflows more flexible, efficient, and scalable.
Abstract
We present PromptArtisan, an approach to multi-instruction image editing that operates in a single pass, eliminating the need for time-consuming iterative refinement. Users can provide multiple editing instructions, each associated with a specific mask within the image. Masks may intersect or overlap, which allows complex and fine-grained image transformations. PromptArtisan combines a pre-trained InstructPix2Pix model with a novel Complete Attention Control Mechanism (CACM), which ensures precise adherence to user instructions and gives fine-grained control over the editing process. The approach is zero-shot, requiring no additional training, and has lower processing complexity than traditional iterative methods. By combining multi-instruction capability, single-pass efficiency, and complete attention control, PromptArtisan supports creative and efficient image editing workflows for novice and expert users alike.
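
To make the idea of mask-localized attention with overlapping regions concrete, the following is a minimal sketch and not the CACM itself. It assumes one plausible gating rule: each pixel's query attends only to the tokens of prompts whose masks cover that pixel, and a pixel covered by no mask receives a zero output. The function name `masked_cross_attention`, the data layout, and the zero-output rule are hypothetical choices made for illustration; the self-attention control described above is not modelled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_cross_attention(queries, prompt_keys, prompt_values, masks):
    """Illustrative mask-gated cross-attention (hypothetical, not the paper's CACM).

    queries:          list of N query vectors, one per pixel.
    prompt_keys[p]:   list of key vectors, one per token of prompt p.
    prompt_values[p]: list of value vectors, one per token of prompt p.
    masks[p][i]:      True if pixel i lies inside the mask of prompt p.
    Returns one output vector per pixel.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    out_dim = len(prompt_values[0][0])
    outputs = []
    for i, query in enumerate(queries):
        # Collect tokens only from prompts whose mask covers this pixel.
        keys, values = [], []
        for p, mask in enumerate(masks):
            if mask[i]:
                keys.extend(prompt_keys[p])
                values.extend(prompt_values[p])
        if not keys:
            # Assumption: pixels outside every mask receive no edit signal.
            outputs.append([0.0] * out_dim)
            continue
        scores = [scale * sum(a * b for a, b in zip(query, k)) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(out_dim)]
        )
    return outputs


# Three pixels, two single-token prompts: pixel 0 is covered by prompt 0 only,
# pixel 1 lies in the overlap of both masks, pixel 2 is covered by neither.
queries = [[1.0, 0.0]] * 3
prompt_keys = [[[0.0, 0.0]], [[0.0, 0.0]]]
prompt_values = [[[1.0, 0.0]], [[0.0, 1.0]]]
masks = [[True, True, False], [False, True, False]]
result = masked_cross_attention(queries, prompt_keys, prompt_values, masks)
```

In this toy setup the pixel in the overlap region blends both prompts' values, while the uncovered pixel is left untouched. A practical implementation would operate on batched tensors inside the denoising network's attention layers rather than on Python lists.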
