HarmonyIQA: Pioneering Benchmark and Model for Image Harmonization Quality Assessment
Zitong Xu, Huiyu Duan, Guangji Ma, Liu Yang, Jiarui Wang, Qingbo Wu, Xiongkuo Min, Guangtao Zhai, Patrick Le Callet
TL;DR
The paper addresses the mismatch between existing image quality assessment (IQA) methods and human perception of image harmonization by introducing HarmonyIQAD, the first dedicated harmony-quality database, with 1,350 harmonized images and 28,350 subjective ratings. It then proposes HarmonyIQA, a large multimodal evaluator that combines visual features from a vision encoder with user prompts via a pre-trained LLM and is trained in two stages using instruction tuning and LoRA. Experiments show that HarmonyIQA achieves state-of-the-art performance on HarmonyIQAD and competitive results on standard IQA benchmarks, with better cross-dataset generalization than self-supervised IQA baselines. The database and model will be released publicly to support the evaluation and development of both non-generative and generative image harmonization algorithms (NGIHAs and GIHAs), so that harmonization results can be judged in a way that matches human perception.
Abstract
Image composition involves extracting a foreground object from one image and pasting it into another. Image harmonization algorithms (IHAs) then adjust the appearance of the foreground object to better match the background. Existing image quality assessment (IQA) methods may fail to align with human visual preference on image harmonization because they are insensitive to minor color or lighting inconsistencies. To address this issue and facilitate the advancement of IHAs, we introduce the first Image Quality Assessment Database for image Harmony evaluation (HarmonyIQAD), which consists of 1,350 harmonized images generated by 9 different IHAs and the corresponding human visual preference scores. Based on this database, we propose a Harmony Image Quality Assessment model (HarmonyIQA) to predict human visual preference for harmonized images. Extensive experiments show that HarmonyIQA achieves state-of-the-art performance on human visual preference evaluation for harmonized images and competitive results on traditional IQA tasks. Furthermore, cross-dataset evaluation shows that HarmonyIQA generalizes better than self-supervised learning-based IQA methods. Both HarmonyIQAD and HarmonyIQA will be made publicly available upon paper publication.
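As an illustration of what "aligning with human visual preference" can mean in practice, the sketch below scores a hypothetical IQA model by the Spearman rank correlation between its predictions and human preference scores. The abstract does not name the evaluation metrics, so the choice of Spearman correlation is an assumption (it is a common choice in IQA work), and the score values and function names are invented for illustration only.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: human preference scores for five harmonized images
# and the scores a quality model predicted for the same images.
human_scores = [4.2, 3.1, 2.5, 4.8, 1.9]
predicted = [0.81, 0.60, 0.65, 0.93, 0.20]

srcc = spearman(human_scores, predicted)
print(round(srcc, 3))  # 0.9
```

A value near 1 means the model ranks images in nearly the same order as human raters, regardless of the scale its scores are expressed on. Here the model swaps the order of one adjacent pair, which gives 0.9.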
