EvoIQA - Explaining Image Distortions with Evolved White-Box Logic

Ruchika Gupta, Illya Bakurov, Nathan Haut, Wolfgang Banzhaf

Abstract

Traditional Image Quality Assessment (IQA) metrics typically fall into one of two extremes: rigid, hand-crafted mathematical models or "black-box" deep learning architectures that lack interpretability. To bridge this gap, we propose EvoIQA, a fully explainable symbolic regression framework based on Genetic Programming (GP) that evolves explicit, human-readable mathematical formulas for IQA. Drawing on a rich terminal set derived from the VSI, VIF, FSIM, and HaarPSI metrics, our framework maps structural, chromatic, and information-theoretic degradations into observable mathematical equations. Our results demonstrate that the evolved GP models consistently achieve strong alignment between their predictions and human visual preferences. Furthermore, they not only outperform traditional hand-crafted metrics but also achieve performance parity with complex, state-of-the-art deep learning models such as DB-CNN, showing that interpretability no longer has to be sacrificed for state-of-the-art performance.

Paper Structure (19 sections, 16 equations, 3 figures, 7 tables)


Figures (3)

  • Figure 1: Hierarchical Decomposition of JPEG2000 (JP2K) Compression. Each row illustrates the transition from visual artifacts to spatial weight maps and their corresponding AGGD signatures. Note that the heavy compression in Row 1 results in a more sharply peaked distribution.
  • Figure 2: Qualitative comparison of color saturation artifacts on the unseen KADID-10k dataset. The reference image (left) is compared against five increasing levels of saturation distortion. While the structural HaarPSI baseline remains invariant to the last chromatic shift, our EvoIQA models leverage a learned feature subset, specifically the evolved Chrominance Similarity ($S_{mn}$) maps, to successfully track perceptual degradation. The resulting predictions show high monotonic alignment with human Mean Opinion Scores (MOS), whereas the baseline results (highlighted in red) fail to capture the intensity of the distortion.
  • Figure 3: Localized vs. Global Distortion Analysis. Comparison of global noise (a) and local block distortion (b). While HaarPSI (c, e) blurs artifacts through multi-resolution pooling, our VSI Gradient map ($s_{gm}$) explicitly isolates unnatural edge discontinuities (d, f). Under identical means ($\mu \approx 0.98$), localized blocks trigger a $2\times$ spike in standard deviation ($\sigma = 0.11$ vs. $0.05$). This Gradient Heterogeneity allows our evolved GP model (Eq. \ref{eq:evoiq-subset}) to penalize structural collapse via the Coefficient of Variation ($\Omega_{cv} = \sigma_{gm}/\mu_{gm}$), identifying distortions that traditional pooling obscures.
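
The Coefficient of Variation in the Figure 3 caption can be illustrated with a minimal sketch. It plugs the caption's reported statistics ($\mu \approx 0.98$, $\sigma = 0.11$ vs. $0.05$) into $\Omega_{cv} = \sigma_{gm}/\mu_{gm}$. The function names and the flat-list representation of a gradient similarity map are assumptions made for illustration; this is not the authors' implementation, and how the evolved formula consumes $\Omega_{cv}$ is not shown here.

```python
import statistics


def cv_from_stats(mu, sigma):
    """Coefficient of Variation, sigma / mu, from precomputed statistics."""
    return sigma / mu


def coefficient_of_variation(similarity_values):
    """Coefficient of Variation of a flattened similarity map.

    `similarity_values` is a hypothetical flat sequence of per-pixel
    similarity scores; the population standard deviation is used.
    """
    mu = statistics.fmean(similarity_values)
    sigma = statistics.pstdev(similarity_values)
    return sigma / mu


# Statistics quoted in the Figure 3 caption.
MU = 0.98
cv_global = cv_from_stats(MU, 0.05)  # global noise
cv_local = cv_from_stats(MU, 0.11)   # local block distortion

# A perfectly uniform map has no heterogeneity, so its CV is zero.
cv_uniform = coefficient_of_variation([1.0] * 16)

print(f"global: {cv_global:.3f}  local: {cv_local:.3f}  uniform: {cv_uniform:.3f}")
```

Because the two cases share the same mean, the ratio of their $\Omega_{cv}$ values equals the ratio of their standard deviations. Mean pooling alone cannot tell them apart, while a dispersion term can.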