Structural Similarity in Deep Features: Image Quality Assessment Robust to Geometrically Disparate Reference
Keke Zhang, Weiling Chen, Tiesong Zhao, Zhou Wang
TL;DR
DeepSSIM introduces a unified, non-training-based IQA metric that remains robust to geometric disparities between reference and test images by representing images with Gram-based deep-structure features and comparing their self-correlations. The method builds $R_{DS}(I)=F(I)F(I)^T$ from pre-trained CNN features (notably VGG16 conv5_1) and computes a local correlation-based DeepSSIM score, with a lightweight version (DeepSSIM-Lite) using global correlations for efficiency. Extensive experiments across AR-IQA and GDR-IQA benchmarks show state-of-the-art performance on aligned references and strong robustness to geometric distortions, outperforming many learning-based baselines in cross-dataset settings. The approach also demonstrates potential as an optimization objective for training image restoration and enhancement systems, suggesting broad applicability beyond standard IQA tasks.
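As a rough illustration of the Gram-based representation $R_{DS}(I)=F(I)F(I)^T$ mentioned above, the sketch below builds the channel-by-channel self-correlation matrix for a toy feature map and compares two such matrices with a single global correlation. Several things here are assumptions rather than details taken from the paper: the features are laid out as a C x N matrix (channels by spatial positions), Pearson correlation stands in for the "global correlation" of the Lite variant, and the helper names `gram`, `pearson`, and `global_similarity` are hypothetical. The hand-written toy values replace real VGG16 conv5_1 activations, and the attention calibration step is omitted, so this should not be read as the authors' implementation.

```python
import math


def gram(features):
    """Return F F^T for a C x N feature matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in features]
            for row_i in features]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def global_similarity(feat_ref, feat_test):
    """Assumed Lite-style score: correlate the two flattened Gram matrices."""
    r_ref = [v for row in gram(feat_ref) for v in row]
    r_test = [v for row in gram(feat_test) for v in row]
    return pearson(r_ref, r_test)


# Toy stand-in for deep features: 3 channels, 4 spatial positions.
feat_ref = [[1.0, 2.0, 0.0, 1.0],
            [0.0, 1.0, 3.0, 1.0],
            [2.0, 0.0, 1.0, 1.0]]

# Same activations with the spatial positions permuted.
perm = [2, 0, 3, 1]
feat_shuffled = [[row[p] for p in perm] for row in feat_ref]

# Different activations, standing in for a distorted test image.
feat_changed = [[1.0, 0.5, 2.0, 0.0],
                [3.0, 1.0, 0.0, 2.0],
                [0.0, 2.0, 1.0, 0.5]]

score_shuffled = global_similarity(feat_ref, feat_shuffled)
score_changed = global_similarity(feat_ref, feat_changed)
print(score_shuffled, score_changed)
```

Because $F F^T$ sums over spatial positions, permuting those positions leaves the matrix unchanged, so the shuffled features score as identical to the reference while the altered features score lower. This is only the simplest form of spatial invariance; real geometric deformations also change the feature values themselves, which the toy example does not model.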
Abstract
Image Quality Assessment (IQA) with references plays an important role in optimizing and evaluating computer vision tasks. Traditional methods assume that all pixels of the reference and test images are fully aligned. Such Aligned-Reference IQA (AR-IQA) approaches fail to address many real-world problems in which the two images differ by various geometric deformations. Although significant effort has been made to attack the Geometrically-Disparate-Reference IQA (GDR-IQA) problem, it has been addressed in a task-dependent fashion, for example, by dedicated designs for image super-resolution and retargeting, or by assuming the geometric distortions to be small enough to be countered by translation-robust filters or by explicit image registration. Here we rethink this problem and propose a unified, non-training-based Deep Structural Similarity (DeepSSIM) approach that addresses the above problems in a single framework. It assesses the structural similarity of deep features in a simple but efficient way and uses an attention calibration strategy to alleviate attention deviation. The proposed method, without application-specific design, achieves state-of-the-art performance on AR-IQA datasets while showing strong robustness to various GDR-IQA test cases. Interestingly, our tests also show the effectiveness of DeepSSIM as an optimization tool for training image super-resolution, enhancement and restoration models, implying an even wider generalizability. Source code will be made public after the review is completed.
