Subjective and Objective Quality Assessment Methods of Stereoscopic Videos with Visibility Affecting Distortions
Sria Biswas, Balasubramanyam Appina, Priyanka Kokil, Sumohana S Channappayya
TL;DR
This work addresses quality assessment of stereoscopic 3D (S3D) videos under visibility-affecting distortions such as fog and haze. It introduces the VAD stereo dataset (12 pristine references and 360 distorted stimuli) and CBSE, an unsupervised, completely blind no-reference (NR) quality assessment model. CBSE fuses the two views into cyclopean frames, analyzes their natural scene statistics (NSS) with a multi-scale, multi-orientation spherical steerable pyramid, fits univariate generalized Gaussian distribution (UGGD) parameters, and models the resulting features with multivariate Gaussians (MVG). Bhattacharyya distances between the pristine and test MVG fits yield the final quality score, $CBSE = S_\mu \times S_{\Sigma}$. A subjective study with 24 human observers provides DMOS for the dataset. CBSE is compared against a wide range of 2D/3D full-reference (FR) and NR IQA/VQA baselines on the IRCCYN, LFOVIA Ph1/Ph2, and VAD datasets, showing competitive or superior performance without training. The work advances S3D QoE research by providing a subjectively annotated dataset and an unsupervised NR metric, with potential extensions to VR/AR content.
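For reference, the Bhattacharyya distance between two MVG fits has a standard closed form with one term driven by the mean vectors and one driven by the covariance matrices. The notation below ($\mu_p, \Sigma_p$ for the pristine fit, $\mu_t, \Sigma_t$ for the test fit) is illustrative and not taken from the summary above, and the mapping of the two terms onto the two factors of the score is an assumed reading rather than a stated definition.

$$
D_B = \underbrace{\tfrac{1}{8}\,(\mu_t - \mu_p)^{\top}\,\bar{\Sigma}^{-1}\,(\mu_t - \mu_p)}_{\text{mean term}} \;+\; \underbrace{\tfrac{1}{2}\,\ln\frac{\det\bar{\Sigma}}{\sqrt{\det\Sigma_p\,\det\Sigma_t}}}_{\text{covariance term}}, \qquad \bar{\Sigma} = \frac{\Sigma_p + \Sigma_t}{2}
$$

Under that reading, the first term would play the role of the mean-based factor and the second the covariance-based factor in the product above.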
Abstract
We present two major contributions in this work. 1) We create a full-HD stereoscopic 3D (S3D) video dataset comprising 12 reference and 360 distorted videos. The test stimuli are produced by simulating five levels of fog and haze ambiance on the pristine left and right video sequences. We conduct a subjective study of the dataset with 24 viewers and compute Difference Mean Opinion Scores (DMOS) as its quality labels. 2) We develop an Opinion Unaware (OU) and Distortion Unaware (DU) video quality assessment model for S3D videos. We construct cyclopean frames from the individual views of an S3D video and partition them into nonoverlapping blocks. We analyze the Natural Scene Statistics (NSS) of all blocks of the pristine and test videos and empirically model the NSS features with a Univariate Generalized Gaussian Distribution (UGGD). We compute the UGGD model parameters (α, β) at multiple spatial scales and orientations of a spherical steerable pyramid decomposition and show that these parameters are distortion-discriminable. Further, we perform Multivariate Gaussian (MVG) modeling on the pristine and distorted video feature sets and compute the mean vectors and covariance matrices of the MVG fits. We compute Bhattacharyya distance measures between the mean vectors and between the covariance matrices to estimate the perceptual deviation of a test video from the pristine video set. Finally, we pool both distance measures to estimate the overall quality score of an S3D video. The performance of the proposed objective algorithm is verified on the popular S3D video datasets IRCCYN, LFOVIAS3DPh1, and LFOVIAS3DPh2, and on the proposed VAD stereo dataset. The algorithm delivers consistent performance across all datasets and is competitive with off-the-shelf 2D and 3D image and video quality assessment algorithms.
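The feature and distance steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_uggd`, `fit_mvg`, `bhattacharyya_terms`) are hypothetical, the UGGD parameters are estimated by moment matching (a common choice in NSS models, though the abstract does not name an estimator), the two distance measures are taken to be the mean and covariance terms of the standard Bhattacharyya distance between Gaussians, and the pooling uses the product form given in the TL;DR. Cyclopean frame construction and the steerable pyramid decomposition are omitted.

```python
import math
import numpy as np


def fit_uggd(coeffs):
    """Moment-matching estimate of a zero-mean generalized Gaussian.

    Returns (shape, scale) for a density proportional to
    exp(-(|x| / scale) ** shape). Which of the two the abstract calls
    alpha and which beta is not stated there.
    """
    x = np.asarray(coeffs, dtype=np.float64).ravel()
    x = x - x.mean()  # assumption: subband coefficients are treated as zero-mean
    second_moment = np.mean(x ** 2)
    rho_hat = np.mean(np.abs(x)) ** 2 / second_moment

    # Theoretical ratio (E|x|)^2 / E[x^2] = G(2/g)^2 / (G(1/g) G(3/g)),
    # evaluated on a grid of candidate shapes g (grid range is arbitrary).
    shapes = np.arange(0.2, 10.0, 0.001)
    rho = np.array([
        math.exp(2 * math.lgamma(2 / g) - math.lgamma(1 / g) - math.lgamma(3 / g))
        for g in shapes
    ])
    shape = float(shapes[np.argmin((rho - rho_hat) ** 2)])
    # Variance of the density above is scale^2 * G(3/shape) / G(1/shape).
    scale = math.sqrt(
        second_moment * math.exp(math.lgamma(1 / shape) - math.lgamma(3 / shape))
    )
    return shape, scale


def fit_mvg(features):
    """Mean vector and covariance of a (num_samples, num_features) array."""
    feats = np.asarray(features, dtype=np.float64)
    return feats.mean(axis=0), np.cov(feats, rowvar=False)


def bhattacharyya_terms(mu_p, cov_p, mu_t, cov_t):
    """Mean and covariance terms of the Bhattacharyya distance between
    N(mu_p, cov_p) and N(mu_t, cov_t). Covariances must be nonsingular."""
    cov_avg = 0.5 * (cov_p + cov_t)
    diff = mu_t - mu_p
    s_mu = 0.125 * float(diff @ np.linalg.solve(cov_avg, diff))
    _, logdet_avg = np.linalg.slogdet(cov_avg)
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_t = np.linalg.slogdet(cov_t)
    s_sigma = 0.5 * (logdet_avg - 0.5 * (logdet_p + logdet_t))
    return s_mu, float(s_sigma)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for per-block feature vectors (not real video data).
    pristine_feats = rng.normal(0.0, 1.0, size=(500, 4))
    test_feats = rng.normal(0.5, 1.5, size=(500, 4))
    s_mu, s_sigma = bhattacharyya_terms(*fit_mvg(pristine_feats), *fit_mvg(test_feats))
    print("pooled score:", s_mu * s_sigma)  # product pooling, as in the TL;DR
```

In a full pipeline, each row passed to `fit_mvg` would hold the UGGD parameters of one cyclopean block collected across all scales and orientations. Both terms are zero when the two fits coincide, so a larger pooled value indicates a larger deviation from the pristine set.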
