2AFC Prompting of Large Multimodal Models for Image Quality Assessment
Hanwei Zhu, Xiangjie Sui, Baoliang Chen, Xuelin Liu, Peilin Chen, Yuming Fang, Shiqi Wang
TL;DR
The paper tackles image quality assessment (IQA) with large multimodal models (LMMs) by framing IQA as a 2AFC prompting task and using MAP estimation to convert pairwise preferences into a global ranking. It introduces coarse-to-fine pairing rules and three evaluation metrics (consistency, accuracy, and correlation) to systematically quantify IQA ability across diverse datasets. Experiments on eight IQA datasets show that GPT-4V most closely matches human judgments at a coarse level, while open LMMs exhibit biases and struggle with fine-grained discrimination, indicating substantial room for improvement. The work provides a practical benchmark and methodology to guide future development of LMM-based IQA systems and highlights the value of realistic distortions in training data.
Abstract
While abundant research has been conducted on improving the high-level visual understanding and reasoning capabilities of large multimodal models (LMMs), their image quality assessment (IQA) ability has been relatively under-explored. Here we take initial steps towards closing this gap by employing two-alternative forced choice (2AFC) prompting, as 2AFC is widely regarded as the most reliable way of collecting human opinions of visual quality. The global quality score of each image, as estimated by a particular LMM, can then be efficiently aggregated from the pairwise preferences using maximum a posteriori (MAP) estimation. We also introduce three evaluation criteria (consistency, accuracy, and correlation) to provide comprehensive quantification of, and deeper insight into, the IQA capability of five LMMs. Extensive experiments show that existing LMMs exhibit remarkable IQA ability on coarse-grained quality comparison, but there is room for improvement on fine-grained quality discrimination. The proposed dataset sheds light on the future development of IQA models based on LMMs. The code will be made publicly available at https://github.com/h4nwei/2AFC-LMMs.
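To make the aggregation step concrete, the following is a minimal sketch of how 2AFC outcomes could be turned into global quality scores by MAP estimation. The abstract does not specify the preference model or the prior, so this sketch assumes a Bradley-Terry likelihood with a zero-mean Gaussian prior, fitted by plain gradient ascent; the function name `map_scores`, its parameters, and the toy win counts are all hypothetical and are not taken from the paper or its repository.

```python
import numpy as np

def map_scores(wins, prior_precision=1.0, lr=0.05, steps=5000):
    """Estimate per-image quality scores from a pairwise win-count matrix.

    wins[i, j] = number of times image i was preferred over image j.
    Assumed model (not from the paper): Bradley-Terry likelihood,
    P(i preferred over j) = sigmoid(s_i - s_j), with a Gaussian prior
    N(0, 1 / prior_precision) on every score.
    """
    wins = np.asarray(wins, dtype=float)
    n = wins.shape[0]
    s = np.zeros(n)
    for _ in range(steps):
        diff = s[:, None] - s[None, :]       # diff[i, j] = s_i - s_j
        p = 1.0 / (1.0 + np.exp(-diff))      # p[i, j] = P(i preferred over j)
        # Gradient of the log-posterior with respect to each score.
        grad = (wins * (1.0 - p)).sum(axis=1) - (wins.T * p).sum(axis=1)
        grad -= prior_precision * s          # contribution of the Gaussian prior
        s += lr * grad
    return s

# Toy data: three images, five comparisons per pair (made-up counts).
wins = [[0, 4, 5],
        [1, 0, 4],
        [0, 1, 0]]
scores = map_scores(wins)
ranking = np.argsort(-scores)  # indices sorted from best to worst
```

The prior keeps the scores finite even when one image wins every comparison (as image 0 does against image 2 here) and fixes the otherwise arbitrary offset of the score scale. A Thurstone-style model would replace the sigmoid with a Gaussian CDF without changing the overall structure.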
