Non-Aligned Reference Image Quality Assessment for Novel View Synthesis
Abhijay Ghildyal, Rajesh Sureddi, Nabajeet Barman, Saman Zadtootaghaj, Alan Bovik
TL;DR
The paper tackles perceptual quality assessment for novel view synthesis (NVS) when pixel-aligned references are unavailable. It introduces Non-Aligned Reference IQA (NAR-IQA) and the NOVA model, which is trained on synthetic, localized distortions within Temporal Regions of Interest (TROI) using contrastive learning on a LoRA-enhanced DINOv2 backbone. By combining supervision from existing IQA models with KL regularization and carefully curated triplets, NOVA achieves state-of-the-art performance in both aligned and non-aligned reference settings and correlates strongly with human judgments on NVS benchmarks. The work also provides a large synthetic training dataset, a comprehensive NVS NAR-IQA benchmark, and supplementary visualizations to aid interpretability. These contributions matter for real-world NVS quality assessment, where aligned references are scarce.
Abstract
Evaluating the perceptual quality of Novel View Synthesis (NVS) images remains a key challenge, particularly in the absence of pixel-aligned ground truth references. Full-Reference Image Quality Assessment (FR-IQA) methods fail under misalignment, while No-Reference (NR-IQA) methods struggle with generalization. In this work, we introduce a Non-Aligned Reference (NAR-IQA) framework tailored for NVS, which assumes that the reference view shares partial scene content but lacks pixel-level alignment. We construct a large-scale image dataset containing synthetic distortions targeting Temporal Regions of Interest (TROI) to train our NAR-IQA model. Our model is built on a contrastive learning framework that incorporates LoRA-enhanced DINOv2 embeddings and is guided by supervision from existing IQA methods. We train exclusively on synthetically generated distortions, deliberately avoiding overfitting to specific real NVS samples and thereby enhancing the model's generalization capability. Our model outperforms state-of-the-art FR-IQA, NR-IQA, and NAR-IQA methods, achieving robust performance on both aligned and non-aligned references. We also conduct a novel user study to gather data on human preferences when viewing non-aligned references in NVS, and we find strong correlation between the predictions of our proposed quality model and the collected subjective ratings. For dataset and code, please visit our project page: https://stootaghaj.github.io/nova-project/
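
To make the contrastive, triplet-based training idea concrete, the following is a minimal sketch of a generic triplet margin loss over image embeddings. It is not the authors' loss: the cosine distance, the margin value, the function names, and the reading of the triplet roles (anchor as the reference view, positive as the less distorted image, negative as the more distorted image) are all assumptions made for illustration. The KL regularization and IQA-model supervision mentioned above are not modeled here.

```python
import numpy as np


def cosine_distance(a, b):
    """Row-wise cosine distance between two batches of embeddings."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return 1.0 - np.sum(a * b, axis=-1)


def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Generic triplet margin loss (hypothetical stand-in, not the paper's loss).

    Assumed roles: anchor = reference embedding, positive = embedding of the
    less distorted image, negative = embedding of the more distorted image.
    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin`.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return float(np.maximum(0.0, d_pos - d_neg + margin).mean())


# Toy 2-D embeddings standing in for backbone features.
anchor = np.array([[1.0, 0.0]])
close = np.array([[1.0, 0.0]])
far = np.array([[0.0, 1.0]])

loss_ordered = triplet_margin_loss(anchor, close, far)
loss_swapped = triplet_margin_loss(anchor, far, close)
```

With the toy vectors above, the correctly ordered triplet incurs no penalty, while the swapped triplet is penalized. In a real setup the embeddings would come from the trained backbone rather than hand-written arrays.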
