P-Hologen: An End-to-End Generative Framework for Phase-Only Holograms

JooHyun Park, YuJin Jeon, HuiYong Kim, SeungHwan Baek, HyeongYeop Kang

TL;DR

P-Hologen tackles generative modeling for phase-only holograms (POHs) by learning discrete image–phase latents with a VQ-VAE and applying the angular spectrum method (ASM) during training, which enables end-to-end POH generation. The approach leverages image-domain training data, a discrete latent space, and a PixelSNAIL sampler to produce unseen POHs with better reconstruction quality and computational efficiency than two-stage pipelines. Empirical results show that a VQ-VAE outperforms a standard VAE on image-to-POH tasks, that a 9:1 mix of MSE and perceptual loss yields the best trade-off between pixel fidelity and perceptual quality, and that P-Hologen achieves superior PSNR/SSIM and competitive FID relative to the SSHM and HoloNet baselines while using fewer parameters and decoding faster. Optical reconstructions on real hardware validate the practical viability of the generated POHs, highlighting the method’s potential for real-world holographic content creation and editing. Future work aims to handle variable propagation distances and depth cues.
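
To make the "discrete latents" idea concrete, the following is a minimal sketch of the nearest-codebook lookup at the core of a generic VQ-VAE. It is not taken from the P-Hologen code: the function name `quantize`, the array shapes, and the use of plain NumPy are assumptions made for illustration only.

```python
import numpy as np

def quantize(z_e, codebook):
    """Replace each encoder vector with its nearest codebook entry.

    z_e:      (N, D) continuous encoder outputs (hypothetical shape)
    codebook: (K, D) learned embedding vectors (hypothetical shape)
    Returns the quantized vectors (N, D) and their integer indices (N,).
    """
    # Squared L2 distance between every encoder vector and every code: (N, K)
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)
    return codebook[indices], indices
```

The integer indices form the discrete latent map; an autoregressive prior such as PixelSNAIL can then be trained over those indices and sampled to produce new latents.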

Abstract

Holography stands at the forefront of visual technology, offering immersive, three-dimensional visualizations through the manipulation of light wave amplitude and phase. Although generative models have been extensively explored in the image domain, their application to holograms remains relatively underexplored due to the inherent complexity of phase learning. Exploiting generative models for holograms offers exciting opportunities for advancing innovation and creativity, such as semantic-aware hologram generation and editing. Currently, the most viable approach for utilizing generative models in the hologram domain involves integrating an image-based generative model with an image-to-hologram conversion model, which comes at the cost of increased computational complexity and inefficiency. To tackle this problem, we introduce P-Hologen, the first end-to-end generative framework designed for phase-only holograms (POHs). P-Hologen employs vector quantized variational autoencoders to capture the complex distributions of POHs. It also integrates the angular spectrum method into the training process, constructing latent spaces for complex phase data using strategies from the image processing domain. Extensive experiments demonstrate that P-Hologen achieves superior quality and computational efficiency compared to the existing methods. Furthermore, our model generates high-quality unseen, diverse holographic content from its learned latent space without requiring pre-existing images. Our work paves the way for new applications and methodologies in holographic content creation, opening a new era in the exploration of generative holographic content. The code for our paper is publicly available on https://github.com/james0223/P-Hologen.
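
As a rough illustration of the angular spectrum method mentioned in the abstract, the sketch below propagates a phase-only field over a fixed distance and returns the resulting amplitude. It is a generic textbook formulation, not the authors' implementation; the function name `asm_propagate` and the parameters `wavelength`, `pitch`, and `distance` are assumed here, and the abstract does not specify their values.

```python
import numpy as np

def asm_propagate(phase, wavelength, pitch, distance):
    """Propagate a unit-amplitude, phase-only field with the angular spectrum method.

    phase:      (H, W) phase values in radians
    wavelength: light wavelength in metres (assumed parameter)
    pitch:      pixel pitch in metres (assumed parameter)
    distance:   propagation distance in metres (assumed parameter)
    Returns the amplitude of the propagated field, shape (H, W).
    """
    field = np.exp(1j * phase)  # phase-only hologram: amplitude is 1 everywhere
    ny, nx = phase.shape
    fx = np.fft.fftfreq(nx, d=pitch)  # spatial frequencies along x
    fy = np.fft.fftfreq(ny, d=pitch)  # spatial frequencies along y
    FX, FY = np.meshgrid(fx, fy)      # both have shape (ny, nx)

    # Longitudinal wavenumber; evanescent components (arg <= 0) are discarded
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    propagating = arg > 0
    kz = 2.0 * np.pi * np.sqrt(np.where(propagating, arg, 0.0))
    transfer = np.where(propagating, np.exp(1j * kz * distance), 0.0)

    # Filter the field's spectrum with the transfer function
    out = np.fft.ifft2(np.fft.fft2(field) * transfer)
    return np.abs(out)
```

Because every step is a differentiable array operation, the same computation can be written in an autodiff framework so that an image-domain loss on the propagated amplitude can drive the training of a phase decoder.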

Paper Structure (18 sections, 7 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: During the training phase, P-Hologen learns image-phase latent representations from an image dataset. During inference, POHs can be generated by sampling the image-phase latent space. These sampled POHs are then propagated using the ASM to reconstruct previously unseen images.
  • Figure 2: Overall architecture of P-Hologen
  • Figure 3: Visualization of the latent spaces learned with four different approaches on the MNIST dataset. The rows represent the architecture, and the columns represent the task. The images inside the dashed box are novel data instances sampled from each latent space.
  • Figure 4: Illustration of POH reconstructions using different loss functions: (a) $L_{\mathrm{mse}}$ only; (b) a 9:1 ratio of $L_{\mathrm{mse}}$ and $L_{\mathrm{per}}$; (c) $L_{\mathrm{per}}$ only. Each example shows a single channel to enhance noise artifact visualization.
  • Figure 5: POHs and their reconstructions of each model using GT images from CelebA-HQ and AF-C datasets.
  • ...and 3 more figures
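
The 9:1 combination of $L_{\mathrm{mse}}$ and $L_{\mathrm{per}}$ named in the Figure 4 caption can be sketched as a weighted sum. This is only an illustration: reading the ratio as weights 0.9 and 0.1 is an assumption, the function name `mixed_loss` is hypothetical, and `features` stands in for whatever pretrained feature extractor defines the perceptual term.

```python
import numpy as np

def mixed_loss(recon, target, features, w_mse=0.9, w_per=0.1):
    """Weighted sum of a pixel-wise MSE term and a feature-space (perceptual) term.

    recon, target: arrays of the same shape
    features:      callable mapping an image to a feature array
                   (placeholder for a pretrained network; an assumption)
    w_mse, w_per:  weights; 0.9 and 0.1 are one reading of the 9:1 ratio
    """
    l_mse = np.mean((recon - target) ** 2)
    l_per = np.mean((features(recon) - features(target)) ** 2)
    return w_mse * l_mse + w_per * l_per
```

Setting `w_per=0` or `w_mse=0` gives the two single-loss settings shown in panels (a) and (c) of Figure 4.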