EditHF-1M: A Million-Scale Rich Human Preference Feedback for Image Editing

Zitong Xu; Huiyu Duan; Zhongpeng Ji; Xinyun Zhang; Yutao Liu; Xiongkuo Min; Ke Gu; Jian Zhang; Shusong Xu; Jinwei Chen; Bo Li; Guangtao Zhai

Abstract

Recent text-guided image editing (TIE) models have made remarkable progress, yet many edited images still suffer from artifacts, unintended edits, and unaesthetic content. Although some benchmarks and methods have been proposed for evaluating edited images, scalable evaluation models are still lacking, which limits the development of human-feedback reward models for image editing. To address these challenges, we first introduce EditHF-1M, a million-scale image editing dataset with over 29M human preference pairs and 148K human mean opinion ratings, both evaluated along three dimensions: visual quality, instruction alignment, and attribute preservation. Based on EditHF-1M, we propose EditHF, an evaluation model based on a multimodal large language model (MLLM) that provides human-aligned feedback for image editing. Finally, we introduce EditHF-Reward, which uses EditHF as the reward signal to optimize text-guided image editing models through reinforcement learning. Extensive experiments show that EditHF achieves superior alignment with human preferences and generalizes well to other datasets. Furthermore, we fine-tune Qwen-Image-Edit using EditHF-Reward and obtain significant performance improvements, which demonstrates that EditHF can serve as a reward model for scaling up image editing. Both the dataset and code will be released in our GitHub repository: https://github.com/IntMeGroup/EditHF.

Paper Structure (20 sections, 7 equations, 5 figures, 6 tables)

Introduction
Related Work
Image Editing
Benchmarks for Image Editing
Evaluation Metrics for Image Editing
EditHF-1M
Design of Editing Tasks
Data Collection
Subjective Experiment
Data Analysis
EditHF
Model Architecture
Training Strategy
Experiments
Experiment Setup
...and 5 more sections

Figures (5)

Figure 1: Overview of EditHF-1M and human annotations: (a) Ranking: edited images from different editing models are grouped by source image and editing prompt and ranked to assess relative quality; (b) Scoring: each edited image is rated individually for evaluating absolute quality.
Figure 2: Comparison of image editing models using EditHF-1M. Ranking scores are derived from win counts in pairwise group comparisons.
Figure 3: Comparison of image editing models across different editing tasks.
Figure 4: Overview of the EditHF architecture and the editing model refinement process guided by EditHF.
Figure 5: Examples from Qwen-Image-Edit refined by our EditHF and other competitive editing models.
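
The Figure 2 caption states that ranking scores are derived from win counts in pairwise group comparisons. The following is a minimal Python sketch of one way such counts could be computed. It assumes each group is a list of model names ordered from best to worst with no ties. The function names and the model names are hypothetical and do not come from the paper, whose actual scoring procedure may differ.

```python
from collections import Counter
from itertools import combinations


def pairwise_preferences(ranking):
    """Expand one ranked group (best first) into (winner, loser) pairs.

    Assumption: the ranking is a strict order with no ties.
    """
    # combinations() preserves input order, so the first item of each
    # pair is always the higher-ranked one.
    return list(combinations(ranking, 2))


def win_counts(groups):
    """Count pairwise wins per model across all ranked groups."""
    wins = Counter()
    for ranking in groups:
        for model in ranking:
            wins[model] += 0  # ensure models with zero wins still appear
        for winner, _loser in pairwise_preferences(ranking):
            wins[winner] += 1
    return wins


# Hypothetical rankings for three source-image/prompt groups.
groups = [
    ["model_a", "model_b", "model_c"],
    ["model_b", "model_a", "model_c"],
    ["model_a", "model_c", "model_b"],
]
for model, count in win_counts(groups).most_common():
    print(model, count)
```

Under this assumption, a group of n ranked images yields n(n-1)/2 preference pairs, which is how a moderate number of ranked groups can expand into a much larger number of pairs.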
