Holographic Parallax Improves 3D Perceptual Realism

Dongyeon Kim, Seung-Woo Nam, Suyeon Choi, Jong-Mo Seo, Gordon Wetzstein, Yoonchan Jeong

TL;DR

The paper tackles how CGH supervision formats affect perceptual realism in holographic near-eye displays. It builds a full-color perceptual testbed and compares 2.5D RGB-D, 3D focal stacks, 4D light-field, and center-view CGH supervision under natural viewing. Results show that incorporating parallax cues via 4D light-field supervision yields the strongest 3D realism across eyebox conditions, outperforming viewpoint-specific formats. The findings provide design guidelines for perceptually realistic holographic VR/AR experiences and underscore the need for perceptual metrics that account for pupil dynamics and eye movements.

Abstract

Holographic near-eye displays are a promising technology to solve long-standing challenges in virtual and augmented reality display systems. Over the last few years, many different computer-generated holography (CGH) algorithms have been proposed that are supervised by different types of target content, such as 2.5D RGB-depth maps, 3D focal stacks, and 4D light fields. It is unclear, however, what the perceptual implications are of the choice of algorithm and target content type. In this work, we build a perceptual testbed of a full-color, high-quality holographic near-eye display. Under natural viewing conditions, we examine the effects of various CGH supervision formats and conduct user studies to assess their perceptual impacts on 3D realism. Our results indicate that CGH algorithms designed for specific viewpoints exhibit noticeable deficiencies in achieving 3D realism. In contrast, holograms incorporating parallax cues consistently outperform other formats across different viewing conditions, including the center of the eyebox. This finding is particularly interesting and suggests that the inclusion of parallax cues in CGH rendering plays a crucial role in enhancing the overall quality of the holographic experience. This work represents an initial stride towards delivering a perceptually realistic 3D experience with holographic near-eye displays.

Paper Structure

This paper contains 14 sections, 10 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Various CGH supervision targets (2.5D, 3D, 4D) for holographic displays to realize the natural volumetric scene (ground truth, GT). The reconstructed epipolar plane images (EPIs) of the individual data formats are provided to demonstrate the angle-dependent spatial information, and the red-boxed regions are enlarged to show the differences. The EPIs are reconstructed with 25 horizontal views for the GT case, 5 planar images for the 2.5D and 3D cases, and 5 view images for the 4D case. Dragon, Bunny: credit to Stanford Computer Graphics Laboratory.
  • Figure 2: Holographic reconstruction with different CGH supervision targets. (A) Near-depth-focused holographic images of the landscape_day scene, reconstructed in the full-eyebox condition: 2.5D (2nd col, focal slab), 3D w/ RGB-D (3rd col, focal stack with RGB-D), 3D w/ LF (4th col, focal stack with $25\times25$ LF), and 4D (5th col, $9\times9$ LF), shown alongside the ground truth (GT) focal stack (1st col). (B) Enlargements of the corresponding holographic scenes reconstructed under 7 different pupil settings (pupil displacement and size); the fully vignetted condition (box with dashed line) is omitted. Each enlargement is annotated with the quality metrics PSNR, SSIM (maximum of 1), and FovVideoVDP (JOD units, maximum of 10; Mantiuk et al., 2021), evaluated against the GT focal stack. Here, the pupil displacement (${x}_{p,norm}$) is the eye pupil's displacement (${x}_{p}$) along the horizontal axis and the pupil size (${D}_{p,norm}$) is the diameter of the human eye pupil (${D}_{p}$); both values are normalized by the width of the eyebox (${w}_{eyebox}$, 2.2 mm). Red arrows mark enlargements reconstructed under an overfilled pupil, and green arrows mark those visualized under an underfilled pupil. (Purchased Unity asset: Low Poly Series: Landscape.)
  • Figure 3: 3D holographic perceptual testbed and stimuli. (A) We conduct the user study using the apparatus shown on the left. Holographic scenes are generated using various CGH methods, using targets rendered with scenes purchased from the Unity Asset Store (Fantastic-Village Pack).
  • Figure 4: Experimental results with different pupil positions. Holographic scenes supervised with 2D (1st col), 2.5D (2nd), 3D w/ RGB-D (3rd), 3D w/ LF (4th), and 4D (5th) targets are captured at different pupil positions (red: $({x}_{p,norm},{D}_{p,norm})=(0,1.1)$, yellow: $(-0.68,1.1)$, and green: $(-1.36,1.1)$). The scenes are photographed at specific focal states (landscape_day: 7th, village: 7th) out of 9 distinct focal states equally sampled in diopters. Enlargements are shown with the image focused on the magnified object. The color of each enlargement row indicates the pupil position (red: center, yellow: decentered, green: vignetted). We intentionally provide the results without modifying the brightness to show the energy across the eyebox. Note that the 3D w/ LF and 4D cases are hard to distinguish in the captured results.
  • Figure 5: User experiment results. (A) 3D realism is assessed using CGHs supervised with four target formats (2.5D in yellow, 3D w/ RGB-D in green, 3D w/ LF in blue, 4D in red) across four viewing conditions (Center, Decentered, Vignetted, and with head movement). The mean JOD is set to zero for each viewing condition. Error bars represent 95% confidence intervals estimated by bootstrapping 500 samples. Asterisks (blue: 3D w/ LF vs. paired cases, red: 4D vs. other cases) indicate the statistical significance of differences (*: $p<0.05$, **: $p<0.01$, ***: $p<0.001$). (B) The tracked trajectory of the pupil center for one representative subject. Error bars represent the 95% confidence interval of the pupil displacement. (C) Measured pupil diameters of representative subjects depending on the viewing conditions. Each black circle corresponds to the pupil diameter of an individual subject, and the dashed line denotes the width of the eyebox in our experimental setup.
  • ...and 3 more figures
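
The captions for Figures 2 and 4 report pupil displacement and pupil diameter normalized by the eyebox width (2.2 mm). As a rough illustration, the sketch below converts the normalized pupil settings quoted in the Figure 4 caption back to millimetres. The function and variable names (`denormalize_pupil`, `FIGURE4_SETTINGS`, `figure4_settings_mm`) are hypothetical and chosen only for this example; the sketch assumes normalization is a plain division by the eyebox width, as the Figure 2 caption describes.

```python
# Eyebox width w_eyebox in millimetres, as stated in the Figure 2 caption.
EYEBOX_WIDTH_MM = 2.2


def denormalize_pupil(x_p_norm, d_p_norm, w_eyebox_mm=EYEBOX_WIDTH_MM):
    """Convert eyebox-normalized pupil values to millimetres.

    Assumes x_p_norm = x_p / w_eyebox and D_p_norm = D_p / w_eyebox.
    Returns (x_p, D_p) in millimetres.
    """
    return x_p_norm * w_eyebox_mm, d_p_norm * w_eyebox_mm


# Pupil settings (x_p_norm, D_p_norm) quoted in the Figure 4 caption.
# The dictionary keys follow the caption's colour legend
# (red: center, yellow: decentered, green: vignetted).
FIGURE4_SETTINGS = {
    "center": (0.0, 1.1),
    "decentered": (-0.68, 1.1),
    "vignetted": (-1.36, 1.1),
}


def figure4_settings_mm():
    """Return the Figure 4 pupil settings in millimetres, keyed by condition."""
    return {
        name: denormalize_pupil(x_norm, d_norm)
        for name, (x_norm, d_norm) in FIGURE4_SETTINGS.items()
    }


if __name__ == "__main__":
    for name, (x_mm, d_mm) in figure4_settings_mm().items():
        print(f"{name}: x_p = {x_mm:.2f} mm, D_p = {d_mm:.2f} mm")
```

Under this assumption, the normalized diameter of 1.1 corresponds to a pupil of about 2.42 mm, and the decentered and vignetted positions correspond to horizontal displacements of roughly -1.50 mm and -2.99 mm.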