ARIQA-3DS: A Stereoscopic Image Quality Assessment Dataset for Realistic Augmented Reality

Aymen Sekhri, Seyed Ali Amirshahi, Mohamed-Chaker Larabi

Abstract

As Augmented Reality (AR) technologies advance towards immersive consumer adoption, the need for rigorous Quality of Experience (QoE) assessment becomes critical. However, existing datasets often lack ecological validity, relying on monocular viewing or simplified backgrounds that fail to capture the complex perceptual interplay, termed visual confusion, between real and virtual layers. To address this gap, we present ARIQA-3DS, the first large stereoscopic AR Image Quality Assessment dataset. Comprising 1,200 AR viewports, the dataset fuses high-resolution stereoscopic omnidirectional captures of real-world scenes with diverse augmented foregrounds under controlled transparency and degradation conditions. We conducted a comprehensive subjective study with 36 participants using a video see-through head-mounted display, collecting both quality ratings and simulator-sickness indicators. Our analysis reveals that perceived quality is primarily driven by foreground degradations and modulated by transparency levels, while oculomotor and disorientation symptoms show a progressive but manageable increase during viewing. ARIQA-3DS will be publicly released to serve as a comprehensive benchmark for developing next-generation AR quality assessment models.

Paper Structure

This paper contains 24 sections, 2 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Plots of Spatial Information (SI) and Colorfulness (CF) for the twenty stereoscopic $360^{\circ}$ background images (orange: indoor scenes, blue: outdoor scenes). (a) SI index shows a wide range of structural complexity; (b) CF metric varies from low to high color saturation; (c) the joint distribution indicates that the selected images span a broad region of the SI $\times$ CF space.
  • Figure 2: Plots of Spatial Information (SI) and Colorfulness (CF) for the sixty foreground objects (blue: Graphical, orange: Natural, green: Screenshots). The three semantic categories occupy distinct regions in the SI $\times$ CF feature space, highlighting the diversity of structural textures and chromatic properties represented in the dataset.
  • Figure 3: Representative stereoscopic $360^{\circ}$ background images captured with Insta360 Pro 1 and Insta360 Pro 2 cameras in indoor and outdoor scenes in Poitiers (France) and Gjøvik (Norway). Each reference image is shown alongside distorted versions generated with two color saturation levels (C1, C2) and two HEVC compression levels (H1, H2).
  • Figure 4: Virtual foreground objects included in ARIQA-3DS, spanning three semantic categories: Graphical (Microsoft Paint 3D and Sketchfab), Natural (Pexels), and Screenshots.
  • Figure 5: Illustration of the AR simulation within the VR environment. Three foregrounds are superimposed on a stereoscopic background. The subject focuses on one perceptual viewport at a time, and the view automatically shifts to the next foreground after each rating.
  • ...and 6 more figures
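
The captions of Figures 1 and 2 characterize content diversity with Spatial Information (SI) and Colorfulness (CF), but this page does not state which definitions were used. The following is a minimal sketch under the assumption that SI follows the common ITU-T P.910 form (standard deviation of the Sobel gradient magnitude of the luminance) and CF follows the Hasler-Süsstrunk metric. The function names `spatial_information` and `colorfulness` are illustrative, and the code is kept dependency-free for readability rather than speed.

```python
import math
from statistics import mean, pstdev


def spatial_information(luma):
    """SI (assumed ITU-T P.910 form): std of the Sobel gradient magnitude.

    `luma` is a list of rows of luminance values, at least 3x3.
    Border pixels are skipped because the 3x3 kernel does not fit there.
    """
    h, w = len(luma), len(luma[0])
    mags = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal Sobel response (right column minus left column).
            gx = (luma[y - 1][x + 1] + 2 * luma[y][x + 1] + luma[y + 1][x + 1]
                  - luma[y - 1][x - 1] - 2 * luma[y][x - 1] - luma[y + 1][x - 1])
            # Vertical Sobel response (bottom row minus top row).
            gy = (luma[y + 1][x - 1] + 2 * luma[y + 1][x] + luma[y + 1][x + 1]
                  - luma[y - 1][x - 1] - 2 * luma[y - 1][x] - luma[y - 1][x + 1])
            mags.append(math.hypot(gx, gy))
    return pstdev(mags)


def colorfulness(pixels):
    """CF (assumed Hasler-Süsstrunk form) for a list of (R, G, B) tuples."""
    rg = [r - g for r, g, b in pixels]              # red-green opponent axis
    yb = [0.5 * (r + g) - b for r, g, b in pixels]  # yellow-blue opponent axis
    sigma = math.hypot(pstdev(rg), pstdev(yb))
    mu = math.hypot(mean(rg), mean(yb))
    return sigma + 0.3 * mu


if __name__ == "__main__":
    # Tiny synthetic inputs: a vertical luminance edge and two saturated pixels.
    edge = [[0, 0, 0, 10, 10] for _ in range(4)]
    print("SI:", spatial_information(edge))
    print("CF:", colorfulness([(255, 0, 0), (0, 255, 0)]))
```

Under these assumptions, a flat image yields an SI of zero and a purely gray image yields a CF of zero, so higher values on either axis of the SI $\times$ CF plots correspond to more structural detail or more saturated color, respectively.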