Light Field Image Quality Assessment With Auxiliary Learning Based on Depthwise and Anglewise Separable Convolutions
Qiang Qu, Xiaoming Chen, Vera Chung, Zhibo Chen
TL;DR
The paper tackles no-reference light field image quality assessment (NR-LFIQA) for immersive light field content by introducing ALAS-DADS, a framework that fuses spatial and angular light field features via LF-DSC and LF-ASC, guided by two auxiliary tasks that estimate spatial NSS and angular GDD features. On Win5-LID and SMART, the approach outperforms both full-reference IQA metrics and prior NR-LFIQA methods, with lower RMSE and higher correlation with subjective scores. The key contributions are the extension of depthwise separable convolution to the spatial domain of light fields (LF-DSC), its further extension to the angular domain (LF-ASC), the auxiliary-learning scheme with spatial-angular hints, and demonstrated robustness across distortion types. The work targets QoE-aware immersive media transmission and offers a foundation for extending efficient light field processing to related tasks such as super-resolution and depth estimation.
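The auxiliary-learning idea can be illustrated with a generic multi-task objective. The sketch below is not the loss used by ALAS-DADS: the mean-squared-error terms, the function names, and the weights `w_spatial` and `w_angular` are all assumptions chosen for illustration. It only shows how two auxiliary feature-estimation errors could be added to the primary quality-prediction error as hints.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error between two array-likes of equal shape."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

def auxiliary_learning_loss(q_pred, q_true,
                            nss_pred, nss_true,
                            gdd_pred, gdd_true,
                            w_spatial=0.5, w_angular=0.5):
    """Primary quality loss plus weighted auxiliary losses.

    All names and the weighting scheme are hypothetical; the source
    does not state the loss formulation in this section.
    """
    primary = mse(q_pred, q_true)          # quality score regression
    aux_spatial = mse(nss_pred, nss_true)  # spatial (NSS) feature estimate
    aux_angular = mse(gdd_pred, gdd_true)  # angular (GDD) feature estimate
    return primary + w_spatial * aux_spatial + w_angular * aux_angular

# Toy values, made up purely to exercise the function.
loss = auxiliary_learning_loss([3.0], [4.0],
                               [0.0, 0.0], [2.0, 2.0],
                               [1.0], [1.0])
print(loss)
```

In a setup like this the auxiliary targets are only needed during training; at inference time the primary quality output alone would be used.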
Abstract
In multimedia broadcasting, no-reference image quality assessment (NR-IQA) is used to indicate the user-perceived quality of experience (QoE) and to support intelligent data transmission while optimizing user experience. This paper proposes an improved no-reference light field image quality assessment (NR-LFIQA) metric for future immersive media broadcasting services. First, we extend the concept of depthwise separable convolution (DSC) to the spatial domain of light field images (LFIs) and introduce "light field depthwise separable convolution (LF-DSC)", which extracts an LFI's spatial features efficiently. Second, we theoretically extend LF-DSC to the angular space of LFIs and introduce the novel concept of "light field anglewise separable convolution (LF-ASC)", which extracts both spatial and angular features for comprehensive quality assessment with low complexity. Third, we define spatial and angular feature estimation as auxiliary tasks that aid the primary NR-LFIQA task by providing spatial and angular quality features as hints. To the best of our knowledge, this work is the first exploration of deep auxiliary learning with spatial-angular hints for NR-LFIQA. Experiments were conducted on mainstream LFI datasets, Win5-LID and SMART, with comparisons against mainstream full-reference IQA metrics as well as state-of-the-art NR-LFIQA methods. The experimental results show that the proposed metric yields overall 42.86% and 45.95% smaller prediction errors than the second-best benchmarking metric on Win5-LID and SMART, respectively. In some challenging cases with particular distortion types, the proposed metric reduces the errors by more than 60%.
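For readers unfamiliar with the building block being extended, the following is a minimal NumPy sketch of an ordinary depthwise separable convolution applied independently to each sub-aperture view of a light field. It is not the paper's LF-DSC or LF-ASC: the array layout `(U, V, H, W, C)`, the function names, and the choice to share one filter set across all views are assumptions made for illustration. The `param_counts` helper shows the well-known reason such layers are cheap: a k x k standard convolution needs k·k·C_in·C_out weights, whereas the separable form needs k·k·C_in + C_in·C_out.

```python
import numpy as np

def depthwise_conv2d(x, dw_kernels):
    """Valid-padding depthwise convolution (cross-correlation form).

    x: (H, W, C) image; dw_kernels: (k, k, C), one k x k filter per channel.
    """
    k = dw_kernels.shape[0]
    H, W, C = x.shape
    out = np.zeros((H - k + 1, W - k + 1, C))
    for i in range(k):
        for j in range(k):
            # Each channel is filtered only by its own kernel slice.
            out += x[i:i + H - k + 1, j:j + W - k + 1, :] * dw_kernels[i, j, :]
    return out

def pointwise_conv2d(x, pw_weights):
    """1 x 1 convolution mixing channels. pw_weights: (C_in, C_out)."""
    return x @ pw_weights

def separable_conv_on_views(lf, dw_kernels, pw_weights):
    """Apply the same separable convolution to every view of lf.

    lf: (U, V, H, W, C), with (U, V) angular and (H, W) spatial axes.
    This layout and the weight sharing are illustrative assumptions.
    """
    U, V = lf.shape[:2]
    return np.stack([
        np.stack([
            pointwise_conv2d(depthwise_conv2d(lf[u, v], dw_kernels), pw_weights)
            for v in range(V)
        ])
        for u in range(U)
    ])

def param_counts(k, c_in, c_out):
    """Weight counts (no biases): standard vs. depthwise separable."""
    return k * k * c_in * c_out, k * k * c_in + c_in * c_out

# Toy light field: 2 x 2 views, 5 x 5 pixels, 3 channels.
lf = np.ones((2, 2, 5, 5, 3))
out = separable_conv_on_views(lf, np.ones((3, 3, 3)), np.ones((3, 4)))
print(out.shape, param_counts(3, 3, 4))
```

An anglewise variant would additionally have to factor the filtering along the angular axes `(U, V)`; how the paper does this is not specified in the abstract, so it is left out of the sketch.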
