What Makes for a Good Stereoscopic Image?
Netanel Y. Tamir, Shir Amir, Ranel Itzhaky, Noam Atia, Shobhita Sundaram, Stephanie Fu, Ron Sokolovsky, Phillip Isola, Tali Dekel, Richard Zhang, Miriam Farber
TL;DR
This work addresses the lack of holistic, VR-specific evaluation of stereoscopic quality of experience (SQoE) by introducing the SCOPE dataset and the iSQoE predictor. SCOPE comprises 2400 stereo-image samples with diverse distortions, annotated with two-alternative forced choice (2AFC) human preferences gathered on VR headsets, which enables training a holistic SQoE model. The iSQoE architecture uses cross-attention between the left- and right-image backbones, Siamese training with a hinge loss, and a LoRA-finetuned DINOv2. It aligns better with human preferences than existing stereoscopic and general image quality assessment (SIQA/IQA) baselines and extrapolates robustly to unseen distortions. The findings underscore the need for VR-centric annotations, indicate that user preferences are largely consistent across headsets, and demonstrate practical utility for evaluating mono-to-stereo generation methods in immersive environments.
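The Siamese hinge-loss training on 2AFC annotations mentioned in the TL;DR can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pairwise_hinge_loss`, the margin value, and the convention that a higher score means a preferred sample are all assumptions made for illustration.

```python
def pairwise_hinge_loss(score_a, score_b, prefers_a, margin=0.05):
    """Hinge loss for a single 2AFC comparison (illustrative sketch).

    score_a, score_b: scalar outputs of the same scorer applied to
        stereo samples A and B (shared weights, i.e. Siamese).
    prefers_a: True if the annotator preferred sample A, else False.
    margin: hypothetical margin; the value used in the paper is not
        given in this summary.
    Assumption: a higher score means a better viewing experience.
    """
    # +1 when A should outscore B, -1 when B should outscore A.
    sign = 1.0 if prefers_a else -1.0
    # Zero loss once the preferred sample leads by at least `margin`.
    return max(0.0, margin - sign * (score_a - score_b))
```

Under these assumptions, a pair that is ranked correctly by at least the margin contributes no loss, while a pair ranked in the wrong order is penalized in proportion to the score gap.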
Abstract
With rapid advancements in virtual reality (VR) headsets, effectively measuring stereoscopic quality of experience (SQoE) has become essential for delivering immersive and comfortable 3D experiences. However, most existing stereo metrics focus on isolated aspects of the viewing experience, such as visual discomfort or image quality, and have traditionally been constrained by limited data. To address these gaps, we present SCOPE (Stereoscopic COntent Preference Evaluation), a new dataset comprising real and synthetic stereoscopic images that feature a wide range of common perceptual distortions and artifacts. The dataset is labeled with preference annotations collected on a VR headset, and our findings indicate a notable degree of consistency in user preferences across different headsets. Additionally, we present iSQoE, a new model for stereo quality of experience assessment trained on our dataset. We show that iSQoE aligns better with human preferences than existing methods when comparing mono-to-stereo conversion methods.
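One common way to quantify how well a metric aligns with pairwise human preferences is the fraction of comparisons on which the two agree. The sketch below illustrates that idea only; the exact evaluation protocol of the paper is not stated in this abstract, and the function name `two_afc_agreement` and the higher-is-better score convention are assumptions.

```python
def two_afc_agreement(scores_a, scores_b, human_prefers_a):
    """Fraction of pairs where the metric and the annotator pick the same sample.

    scores_a, scores_b: metric scores for samples A and B of each pair.
    human_prefers_a: booleans, True where the annotator preferred A.
    Assumptions: a higher score means a better sample, and a tie in
    scores is counted as the metric preferring B.
    """
    if not (len(scores_a) == len(scores_b) == len(human_prefers_a)):
        raise ValueError("all inputs must have the same length")
    if not scores_a:
        raise ValueError("at least one comparison is required")
    # Count pairs where the metric's choice matches the human choice.
    agree = sum(
        (a > b) == bool(h)
        for a, b, h in zip(scores_a, scores_b, human_prefers_a)
    )
    return agree / len(scores_a)
```

For example, with metric scores `[0.9, 0.2, 0.5]` for A, `[0.1, 0.8, 0.4]` for B, and human choices `[True, False, False]`, the metric agrees with the annotator on two of the three pairs.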
