AgenticIQA: An Agentic Framework for Adaptive and Interpretable Image Quality Assessment

Hanwei Zhu; Yu Tian; Keyan Ding; Baoliang Chen; Bolin Chen; Shiqi Wang; Weisi Lin

TL;DR

AgenticIQA introduces a modular, agent-based framework for image quality assessment that jointly optimizes scoring accuracy and interpretability. By decomposing IQA into distortion detection, distortion analysis, tool selection, and tool execution under a plan–execute–summarize loop, it achieves adaptive, query-aware evaluations that combine traditional IQA tools with vision-language model reasoning. A large, structured AgenticIQA-200K dataset and the AgenticIQA-Eval benchmark support training and evaluation of planning, execution, and summarization capabilities. Experimental results across diverse IQA benchmarks demonstrate superior scoring precision and explanation quality compared with both score-based and VLM-only baselines, highlighting the practical impact of agentic reasoning for robust perceptual quality assessment.

Abstract

Image quality assessment (IQA) is inherently complex, as it reflects both the quantification and interpretation of perceptual quality rooted in the human visual system. Conventional approaches typically rely on fixed models to output scalar scores, limiting their adaptability to diverse distortions, user-specific queries, and interpretability needs. Furthermore, scoring and interpretation are often treated as independent processes, despite their interdependence: interpretation identifies perceptual degradations, while scoring abstracts them into a compact metric. To address these limitations, we propose AgenticIQA, a modular agentic framework that integrates vision-language models (VLMs) with traditional IQA tools in a dynamic, query-aware manner. AgenticIQA decomposes IQA into four subtasks -- distortion detection, distortion analysis, tool selection, and tool execution -- coordinated by a planner, executor, and summarizer. The planner formulates task-specific strategies, the executor collects perceptual evidence via tool invocation, and the summarizer integrates this evidence to produce accurate scores with human-aligned explanations. To support training and evaluation, we introduce AgenticIQA-200K, a large-scale instruction dataset tailored for IQA agents, and AgenticIQA-Eval, the first benchmark for assessing the planning, execution, and summarization capabilities of VLM-based IQA agents. Extensive experiments across diverse IQA datasets demonstrate that AgenticIQA consistently surpasses strong baselines in both scoring accuracy and explanatory alignment.
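
The planner, executor, and summarizer roles described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the "tools" below are hypothetical stand-ins for traditional IQA metrics, the image is a flat list of grayscale values in [0, 1], and keyword matching stands in for the VLM-driven planning and distortion detection. All function names (`planner`, `executor`, `summarizer`, `assess`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


def sharpness_tool(pixels: List[float]) -> float:
    """Hypothetical blur tool: mean absolute neighbour difference, clipped to [0, 1]."""
    if len(pixels) < 2:
        return 0.0
    diffs = [abs(a - b) for a, b in zip(pixels, pixels[1:])]
    return min(1.0, 2.0 * sum(diffs) / len(diffs))


def exposure_tool(pixels: List[float]) -> float:
    """Hypothetical exposure tool: 1.0 at mean brightness 0.5, 0.0 at 0 or 1."""
    mean = sum(pixels) / len(pixels)
    return 1.0 - 2.0 * abs(mean - 0.5)


# Tool registry keyed by the distortion each tool measures (illustrative only).
TOOLS: Dict[str, Callable[[List[float]], float]] = {
    "blur": sharpness_tool,
    "exposure": exposure_tool,
}


@dataclass
class Plan:
    distortions: List[str]  # distortions the query asks about


def planner(query: str) -> Plan:
    """Select distortions named in the query; fall back to all known ones."""
    mentioned = [d for d in TOOLS if d in query.lower()]
    return Plan(distortions=mentioned or list(TOOLS))


def executor(plan: Plan, pixels: List[float]) -> Dict[str, float]:
    """Run the tool registered for each planned distortion and collect evidence."""
    return {d: TOOLS[d](pixels) for d in plan.distortions}


def summarizer(evidence: Dict[str, float]) -> Tuple[float, str]:
    """Fuse tool outputs into one score plus a short textual explanation."""
    score = sum(evidence.values()) / len(evidence)
    worst = min(evidence, key=evidence.get)
    return score, f"overall {score:.2f}; weakest aspect: {worst} ({evidence[worst]:.2f})"


def assess(query: str, pixels: List[float]) -> Tuple[float, str]:
    """Plan, execute, then summarize for a single query-image pair."""
    plan = planner(query)
    evidence = executor(plan, pixels)
    return summarizer(evidence)
```

Calling `assess("Is this image blurry?", pixels)` restricts the plan to the blur tool, while a generic query such as "Rate the overall quality" falls back to every registered tool. That is one simple way to mimic the query-aware behaviour the abstract describes. In the actual framework these decisions are made by VLM reasoning rather than keyword lookup.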
