Assessing Image Quality Using a Simple Generative Representation

Simon Raviv; Gal Chechik

TL;DR

The paper tackles full-reference image quality assessment (IQA) and addresses the limitations of discriminative, class-focused representations in cross-domain settings. It introduces VAE-QA, a lightweight architecture that leverages the latent space of a pre-trained variational autoencoder (VAE), fuses multi-layer features, and predicts the mean opinion score (MOS) with a small MLP. Across standard IQA benchmarks and cross-dataset tests, VAE-QA achieves state-of-the-art generalization while using substantially fewer parameters and running inference faster than prior methods. The results suggest that generative latent representations better preserve the image details relevant to perceived quality, and the approach could potentially extend to video quality assessment and other generative-model-based quality tasks.

Abstract

Perceptual image quality assessment (IQA) is the task of predicting the visual quality of an image as perceived by a human observer. Current state-of-the-art techniques are based on deep representations trained in a discriminative manner. Such representations may ignore visually important features if those features are not predictive of class labels. Recent generative models successfully learn low-dimensional representations using auto-encoding and have been argued to better preserve visual features. Here we leverage existing auto-encoders and propose VAE-QA, a simple and efficient method for predicting image quality when a full reference is available. We evaluate our approach on four standard benchmarks and find that it significantly improves generalization across datasets while having fewer trainable parameters, a smaller memory footprint, and a faster run time.

Paper Structure (34 sections, 3 equations, 7 figures, 12 tables, 4 algorithms)

Introduction
Related Work
Learning-based IQA
Non-learned IQA
Background
Generative Models
Variational Auto Encoders
IQA setups
Our Method
Feature Extraction Module
Feature Fusion Module
Quality Prediction Module
Experiments
Compared Methods
Evaluation Protocol
...and 19 more sections

Figures (7)

Figure 1: VAE-QA architecture. The feature extraction module uses a VAE to extract representations from the input images. The feature fusion module combines these representations into a compressed representation, using components that operate within and across VAE layers. The quality prediction module feeds the compressed representation to an MLP that predicts the quality score of the input images.
Figure 2: MOS vs. Predicted MOS for three IQA datasets.
Figure 3: MOS vs. Predicted MOS. Trained on KADID-10k, tested on other IQA datasets.
Figure 4: Quality prediction by distortion type on the TID2013 dataset. The figure compares the PLCC obtained with our VAE-QA and with AHIQ.
Figure 5: The effect of the number of crops on the SRCC.
...and 2 more figures
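
The captions for Figures 4 and 5 report PLCC and SRCC, which are commonly read as the Pearson linear correlation coefficient and the Spearman rank correlation coefficient between human MOS values and predicted scores. The sketch below shows one way these two numbers could be computed using only the Python standard library. The function names and the sample scores are made up for illustration and are not taken from the paper. IQA evaluations sometimes fit a nonlinear mapping to the predictions before computing PLCC; this sketch skips that step, and the text here does not say whether the paper applies it.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation (PLCC) between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scores for illustration only (not data from the paper).
mos = [1.2, 2.5, 3.1, 4.0, 4.8]
pred = [1.0, 2.0, 3.5, 3.9, 9.0]

plcc = pearson(mos, pred)
srcc = spearman(mos, pred)
print(f"PLCC={plcc:.3f}  SRCC={srcc:.3f}")
```

In this toy example the predictions preserve the ordering of the MOS values, so SRCC is at its maximum, while the outlying last prediction pulls PLCC below it. This is why the two metrics are usually reported together: SRCC measures monotonic agreement, PLCC measures linear agreement.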
