CLIPAway: Harmonizing Focused Embeddings for Removing Objects via Diffusion Models
Yigit Ekin, Ahmet Burak Yildirim, Erdem Eren Caglar, Aykut Erdem, Erkut Erdem, Aysegul Dundar
TL;DR
Diffusion-based inpainting models often hallucinate the very object they are asked to remove when no explicit guidance is provided. The authors present CLIPAway, which uses AlphaCLIP embeddings to focus on specific image regions and an MLP to align these embeddings with the IP-Adapter space. It then subtracts from the background-focused embedding $\mathbf{e}_{\text{b}}$ its projection onto the foreground-focused embedding $\mathbf{e}_{\text{f}}$, $\mathbf{e}_{\text{final}} = \mathbf{e}_{\text{b}} - ( (\mathbf{e}_{\text{b}} \cdot \mathbf{e}_{\text{f}})/\|\mathbf{e}_{\text{f}}\| ) ( \mathbf{e}_{\text{f}} / \|\mathbf{e}_{\text{f}}\| )$, to suppress foreground content. The approach is plug-and-play and data-agnostic, and it is compatible with multiple diffusion-based inpainting methods. It is evaluated on COCO 2017 with quantitative metrics and a user study that shows a strong preference for CLIPAway. The paper highlights practical impact for image restoration and editing, and notes ethical considerations and limitations: inference is slower than with GANs, and shadows are not removed unless they are included in the mask.
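The subtraction in the formula above is an ordinary vector projection: the component of $\mathbf{e}_{\text{b}}$ that lies along $\mathbf{e}_{\text{f}}$ is removed, so the result is orthogonal to $\mathbf{e}_{\text{f}}$. The NumPy sketch below illustrates only that arithmetic on toy vectors. The function name `remove_foreground_component` and the `eps` guard are assumptions made for illustration and are not taken from the authors' code; real inputs would be embeddings already mapped into the IP-Adapter space.

```python
import numpy as np

def remove_foreground_component(e_b, e_f, eps=1e-12):
    """Return e_b minus its projection onto e_f (hypothetical helper).

    e_b: background-focused embedding, 1-D array.
    e_f: foreground-focused embedding, 1-D array of the same length.
    eps: illustrative guard against a near-zero e_f (an assumption).
    """
    e_b = np.asarray(e_b, dtype=np.float64)
    e_f = np.asarray(e_f, dtype=np.float64)
    norm_f = np.linalg.norm(e_f)
    if norm_f < eps:
        # No usable foreground direction: leave e_b unchanged.
        return e_b.copy()
    unit_f = e_f / norm_f                      # e_f / ||e_f||
    scalar_proj = np.dot(e_b, unit_f)          # (e_b . e_f) / ||e_f||
    return e_b - scalar_proj * unit_f          # e_final

# Toy 2-D example: the foreground direction is the x-axis,
# so the x-component of e_b is removed and the y-component is kept.
e_b = np.array([3.0, 4.0])
e_f = np.array([1.0, 0.0])
e_final = remove_foreground_component(e_b, e_f)
```

In this toy case `e_final` keeps only the part of `e_b` that is perpendicular to `e_f`, which is the sense in which foreground content is suppressed in the embedding.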
Abstract
Advanced image editing techniques, particularly inpainting, are essential for seamlessly removing unwanted elements while preserving visual integrity. Traditional GAN-based methods have achieved notable success, but recent advancements in diffusion models have produced superior results due to their training on large-scale datasets, enabling the generation of remarkably realistic inpainted images. Despite their strengths, diffusion models often struggle with object removal tasks without explicit guidance, leading to unintended hallucinations of the removed object. To address this issue, we introduce CLIPAway, a novel approach leveraging CLIP embeddings to focus on background regions while excluding foreground elements. CLIPAway enhances inpainting accuracy and quality by identifying embeddings that prioritize the background, thus achieving seamless object removal. Unlike other methods that rely on specialized training datasets or costly manual annotations, CLIPAway provides a flexible, plug-and-play solution compatible with various diffusion-based inpainting techniques.
