LocateEdit-Bench: A Benchmark for Instruction-Based Editing Localization
Shiyu Wu, Shuyan Li, Jing Li, Jing Liu, Yequan Wang
TL;DR
This work addresses the problem of localizing edits made by modern instruction-based image editing models, which operate without explicit edit masks and produce changes that blend semantically into the image. It introduces LocateEdit-Bench, a $231$K-sample dataset generated with four editors across three edit types, together with a multi-metric evaluation framework for assessing localization methods. Extensive benchmarking shows that state-of-the-art localization techniques struggle on instruction-based edits and generalize poorly across editors, highlighting the need for editor-agnostic, semantically aware approaches. The dataset and findings provide a foundation for advancing robust forgery localization in the era of AI-powered image manipulation. The dataset will be open-sourced upon acceptance.
Abstract
Recent advances in image editing have enabled highly controllable and semantically aware alteration of visual content, posing unprecedented challenges to manipulation localization. However, existing AI-generated forgery localization methods primarily focus on inpainting-based manipulations, making them ineffective against the latest instruction-based editing paradigms. To bridge this critical gap, we propose LocateEdit-Bench, a large-scale dataset comprising $231$K edited images, designed specifically to benchmark localization methods against instruction-driven image editing. Our dataset incorporates four cutting-edge editing models and covers three common edit types. We conduct a detailed analysis of the dataset and develop two multi-metric evaluation protocols to assess existing localization methods. Our work establishes a foundation for keeping pace with the evolving landscape of image editing, thereby facilitating the development of effective methods for future forgery localization. The dataset will be open-sourced upon acceptance.
