
Structural Similarity in Deep Features: Image Quality Assessment Robust to Geometrically Disparate Reference

Keke Zhang, Weiling Chen, Tiesong Zhao, Zhou Wang

TL;DR

DeepSSIM introduces a unified, non-training-based IQA metric that remains robust to geometric disparities between reference and test images by representing images with Gram-based deep-structure features and comparing their self-correlations. The method builds $R_{\rm DS}(I)=F(I)F(I)^T$ from pre-trained CNN features (notably VGG16 conv5_1) and computes a local correlation-based DeepSSIM score, with a lightweight version (DeepSSIM-Lite) using global correlations for efficiency. Extensive experiments across AR-IQA and GDR-IQA benchmarks show state-of-the-art performance on aligned references and strong robustness to geometric distortions, outperforming many learning-based baselines in cross-dataset settings. The approach also demonstrates potential as an optimization objective for training image restoration and enhancement systems, suggesting broad applicability beyond standard IQA tasks.
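One way to see why a Gram-based representation tolerates geometric disparity is that $F(I)F(I)^T$ sums over spatial positions, so reordering those positions leaves it unchanged. The sketch below illustrates only this algebraic property. It assumes $F(I)$ is laid out as a channels-by-positions matrix (a convention not stated in the summary) and uses random arrays as a stand-in for real VGG16 features; the function name `deep_structure` is hypothetical. Real geometric distortions change CNN features in more complicated ways than a pure permutation, so this is an idealized case.

```python
import numpy as np

def deep_structure(features: np.ndarray) -> np.ndarray:
    """Gram-style representation R_DS = F F^T for a (C, H, W) feature map.

    Assumption: F has one row per channel and one column per spatial position.
    """
    c = features.shape[0]
    f = features.reshape(c, -1)   # F with shape (C, H*W)
    return f @ f.T                # (C, C) channel self-correlation

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 6, 6))   # random stand-in for CNN features

# Two spatial rearrangements of the same feature vectors.
flipped = feat[:, :, ::-1]                                # horizontal flip
perm = rng.permutation(36)
shuffled = feat.reshape(8, -1)[:, perm].reshape(8, 6, 6)  # arbitrary permutation

r_ref = deep_structure(feat)
r_flip = deep_structure(flipped)
r_shuf = deep_structure(shuffled)

# Both rearrangements yield the same representation up to rounding.
print(np.allclose(r_ref, r_flip), np.allclose(r_ref, r_shuf))
```

The representation has size C x C regardless of the spatial resolution, which is also why reference and test images of different sizes can in principle be compared this way.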

Abstract

Image Quality Assessment (IQA) with references plays an important role in optimizing and evaluating computer vision tasks. Traditional methods assume that all pixels of the reference and test images are fully aligned. Such Aligned-Reference IQA (AR-IQA) approaches fail to address many real-world problems in which various geometric deformations exist between the two images. Although significant effort has been made to attack the Geometrically-Disparate-Reference IQA (GDR-IQA) problem, it has been addressed in a task-dependent fashion, for example, by dedicated designs for image super-resolution and retargeting, or by assuming that the geometric distortions are small enough to be countered by translation-robust filters or explicit image registration. Here we rethink this problem and propose a unified, non-training-based Deep Structural Similarity (DeepSSIM) approach that addresses the above problems in a single framework: it assesses the structural similarity of deep features in a simple but efficient way and uses an attention calibration strategy to alleviate attention deviation. The proposed method, without application-specific design, achieves state-of-the-art performance on AR-IQA datasets while showing strong robustness across various GDR-IQA test cases. Interestingly, our tests also show the effectiveness of DeepSSIM as an optimization tool for training image super-resolution, enhancement and restoration models, implying even wider generalizability. (Source code will be made public after the review is completed.)

Paper Structure (22 sections, 10 equations, 2 figures, 4 tables)

Figures (2)

  • Figure 1: Our methodology and its performance when handling geometrically disparate references. We first construct deep structure representations from deep features extracted by a pre-trained network. Then, we calculate the similarity between the deep structure representations of the reference and test images. The proposed DeepSSIM metric is robust to references with geometric disparities.
  • Figure 2: Our proposed DeepSSIM metric. First, we extract deep features ($F(X), F(Y)$) of the reference and distorted images ($X, Y$) from a pre-trained network. Second, we construct deep structure representations ($R_{\rm DS}(X), R_{\rm DS}(Y)$) based on $F(X)$ and $F(Y)$. Third, we compute the structural similarity between the reference and distorted images by comparing their deep structure representations ($R_{\rm DS}(X), R_{\rm DS}(Y)$).
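
The three steps in the Figure 2 caption can be sketched end to end as follows. This is a toy illustration under loud assumptions, not the authors' implementation: `toy_features` is a hypothetical hand-made stand-in for the pre-trained network, and `global_similarity` uses a plain Pearson correlation between the two representations as an assumed score, since the summary does not give the exact DeepSSIM or DeepSSIM-Lite formula.

```python
import numpy as np

def toy_features(img: np.ndarray) -> np.ndarray:
    """Step 1 (hypothetical stand-in for a pre-trained CNN): 3 simple channels."""
    gx = img - np.roll(img, 1, axis=1)   # horizontal difference (circular)
    gy = img - np.roll(img, 1, axis=0)   # vertical difference (circular)
    return np.stack([img, gx, gy])       # shape (C, H, W)

def deep_structure(features: np.ndarray) -> np.ndarray:
    """Step 2: R_DS = F F^T, with F reshaped to (C, H*W) (assumed layout)."""
    f = features.reshape(features.shape[0], -1)
    return f @ f.T

def global_similarity(r_x: np.ndarray, r_y: np.ndarray) -> float:
    """Step 3 (assumed score): Pearson correlation of the flattened matrices."""
    a = r_x.ravel() - r_x.mean()
    b = r_y.ravel() - r_y.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def toy_score(x: np.ndarray, y: np.ndarray) -> float:
    """Run all three steps on a reference x and a test image y."""
    return global_similarity(deep_structure(toy_features(x)),
                             deep_structure(toy_features(y)))

rng = np.random.default_rng(0)
ref = rng.random((32, 32))                    # synthetic reference image
shifted = np.roll(ref, 5, axis=1)             # geometrically displaced copy
noisy = ref + rng.standard_normal(ref.shape)  # heavily degraded copy

print("shifted:", toy_score(ref, shifted))
print("noisy:  ", toy_score(ref, noisy))
```

In this toy setting the circularly shifted copy scores essentially the same as the reference itself, while the noisy copy scores lower. With a real network the invariance would only be approximate.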