Adapting Pretrained Networks for Image Quality Assessment on High Dynamic Range Displays
Andrei Chubarau, Hyunjin Yoo, Tara Akhavan, James Clark
TL;DR
The paper tackles HDR image quality assessment (IQA) by re-targeting SDR-trained networks to HDR content using perceptually uniform (PU) encodings and a domain adaptation framework. The authors propose a training recipe that pre-trains on SDR data, fine-tunes on PU-encoded HDR data, and optionally applies CORAL-based domain adaptation to align SDR and HDR feature representations, so that transfer remains effective when HDR data is limited. Empirical results on SDR and HDR benchmarks show faster convergence and improved HDR generalization, with notable gains from using CORAL and synthetic HDR-like data. The approach offers a practical way to leverage abundant SDR data for HDR IQA and may also apply to other HDR vision tasks. The main contributions are a detailed study of PU encoding normalization, a CORAL-based domain adaptation strategy for HDR transfer, and demonstrated performance gains on UPIQ and related datasets.
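To make the CORAL step concrete, the sketch below computes a correlation alignment loss between a batch of SDR (source) features and a batch of HDR (target) features. It assumes the standard Deep CORAL formulation (squared Frobenius distance between the two feature covariance matrices, scaled by 1/(4d^2)); the summary above does not state which layer's features are aligned or how the loss is weighted, so the function name `coral_loss`, the array shapes, and the random stand-in features are all hypothetical.

```python
import numpy as np


def coral_loss(source_feats, target_feats):
    """CORAL loss between two feature batches.

    source_feats: array of shape (n_source, d), e.g. features of SDR images.
    target_feats: array of shape (n_target, d), e.g. features of HDR images.
    Returns a non-negative scalar: ||C_s - C_t||_F^2 / (4 * d^2).
    """
    source_feats = np.asarray(source_feats, dtype=np.float64)
    target_feats = np.asarray(target_feats, dtype=np.float64)
    d = source_feats.shape[1]

    # Unbiased (d x d) covariance of each batch; rows are samples.
    cov_s = np.atleast_2d(np.cov(source_feats, rowvar=False))
    cov_t = np.atleast_2d(np.cov(target_feats, rowvar=False))

    # Squared Frobenius norm of the covariance difference.
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))


# Hypothetical stand-ins for network features (assumption: 8-dim features).
rng = np.random.default_rng(0)
sdr_feats = rng.normal(size=(64, 8))
hdr_feats = 3.0 * rng.normal(size=(64, 8))  # different second-order statistics
alignment_penalty = coral_loss(sdr_feats, hdr_feats)
```

In a training loop, a term like this would typically be added to the quality-prediction loss with a weighting factor, so that the network is encouraged to produce features with similar second-order statistics for both domains. Because the loss depends only on covariances, it is unaffected by a constant shift of either feature batch.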
Abstract
Conventional image quality metrics (IQMs), such as PSNR and SSIM, are designed for perceptually uniform gamma-encoded pixel values and cannot be directly applied to perceptually non-uniform linear high-dynamic-range (HDR) colors. Similarly, most available datasets consist of standard-dynamic-range (SDR) images collected in standard and possibly uncontrolled viewing conditions. Popular pre-trained neural networks are likewise intended for SDR inputs, which restricts their direct application to HDR content. At the same time, training HDR models from scratch is challenging because little HDR data is available. In this work, we explore more effective approaches for training deep learning-based models for image quality assessment (IQA) on HDR data. We leverage networks pre-trained on SDR data (source domain) and re-target these models to HDR (target domain) with additional fine-tuning and domain adaptation. We validate our methods on the available HDR IQA datasets, demonstrating that models trained with our combined recipe outperform previous baselines, converge much faster, and reliably generalize to HDR inputs.
