Divide-Conquer-and-Merge: Memory- and Time-Efficient Holographic Displays

Zhenxing Dong; Jidong Jia; Yan Li; Yuye Ling

TL;DR

The paper tackles the memory bottleneck in deep-learning CGH for ultra-high-definition holograms by introducing a divide-conquer-and-merge strategy that splits inputs into $r^2$ sub-images, processes them with phase-generator/phase-encoder branches, and merges them via upsampling to form full-resolution holograms. Central to the approach is a lightweight holographic SR network (LFMN) with Local Feature Modulation and Enhanced Convolutional Channel Mixer, designed to preserve quality while reducing memory usage; the method is extensible through a recursive pyramid form for even larger scales. Empirical results show substantial training memory reductions (e.g., 64.3% for HoloNet and 12.9% for CCNNs) and inference speedups (up to 3× and 2×), enabling 8K holograms in simulations and verified by full-color optical experiments. The framework demonstrates significant practical impact by enabling high-definition holographic displays on consumer GPUs and provides a path toward 16K+ holograms through further architectural integration and memory-aware design.
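The size reduction from the divide step can be worked through with a few lines of arithmetic. The sketch below is illustrative only: the helper name `sub_image_shape` is hypothetical, the pixel dimensions are the standard ones for 1080p, 4K, and 8K rather than figures from the paper, and the choice of $r = 2$ is an assumption, since this summary does not state which factors the authors used.

```python
def sub_image_shape(height, width, r):
    """Return the (height, width) of each of the r*r sub-images.

    Hypothetical helper for illustration; not part of the paper's code.
    """
    if height % r or width % r:
        raise ValueError("height and width must be divisible by r")
    return height // r, width // r


# Standard pixel dimensions (assumed here, not quoted from the paper).
RESOLUTIONS = {"1080p": (1080, 1920), "4K": (2160, 3840), "8K": (4320, 7680)}

if __name__ == "__main__":
    r = 2  # illustrative choice of the split factor
    for name, (h, w) in RESOLUTIONS.items():
        sh, sw = sub_image_shape(h, w, r)
        print(f"{name}: {r * r} sub-images of {sh} x {sw} pixels each")
```

Each sub-image holds $1/r^2$ of the original pixels, so the network handles a much smaller input per sub-image. How that translates into actual memory savings depends on the network, as the differing HoloNet and CCNNs figures suggest.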

Abstract

Recently, deep learning-based computer-generated holography (CGH) has demonstrated tremendous potential in three-dimensional (3D) displays and yielded impressive display quality. However, most existing deep learning-based CGH techniques can only generate holograms at 1080p resolution, far below the ultra-high resolution (16K+) that practical virtual reality (VR) and augmented reality (AR) applications require to support a wide field of view and a large eye box. A major obstacle in current CGH frameworks is the limited memory of consumer-grade GPUs, which cannot accommodate the generation of higher-definition holograms. To overcome this challenge, we propose a divide-conquer-and-merge strategy that addresses the scarcity of memory and computational capacity in ultra-high-definition CGH generation. This algorithm enables existing CGH frameworks to synthesize higher-definition holograms faster while maintaining high-fidelity display quality. Both simulations and experiments were conducted to demonstrate the capabilities of the proposed framework. By integrating our strategy into HoloNet and CCNNs, we reduced GPU memory usage during training by 64.3\% and 12.9\%, respectively. We also observed substantial speed improvements in hologram generation, with accelerations of up to 3$\times$ and 2$\times$, respectively. In particular, we trained and ran inference on 8K-definition holograms on an NVIDIA GeForce RTX 3090 GPU for the first time in simulations. Furthermore, we conducted full-color optical experiments to verify the effectiveness of our method. We believe our strategy provides a novel approach for memory- and time-efficient holographic displays.

Paper Structure (28 sections, 3 equations, 8 figures, 4 tables)

Introduction
Related Works
Computer-generated Holography
Image Super-resolution
Proposed Method
Phase Generator
Phase Encoder
Lightweight Holographic SR Network
Local Feature Modulation
Enhanced Convolutional Channel Mixer
Recursive form
Results and Analysis
Implementation Details
Datasets
Evaluation metrics
...and 13 more sections

Figures (8)

Figure 1: An overview of our proposed framework. The method has two parts, which are inserted into the phase generator and phase encoder sections of a CGH generation network, respectively. In the phase generator part, the module first applies a pixel-unshuffle operation to an image of size $H\times W$ to obtain $r^2$ sub-images of size $H/r\times W/r$. These sub-images are then fed into the phase generator of the original network to predict $r^2$ phase sub-images, which a pixel-shuffle layer upsamples into a phase image of size $H\times W$. The phase encoder part is similar, except that in the upsampling step a lightweight SR network replaces the pixel-shuffle layer to improve the quality of the generated hologram. The Big Buck Bunny image comes from www.bigbuckbunny.org (© 2008, Blender Foundation) under the Creative Commons Attribution 3.0 license (https://creativecommons.org/licenses/by/3.0/).
Figure 2: An overview of our proposed lightweight holographic SR network, namely the local feature mixing network (LFMN). The core module of LFMN is the local feature mixing module (LFMM), which comprises a local feature modulation (LFM) module and a modified convolutional channel mixer (CCM) from SAFMN, enhanced for local feature extraction.
Figure 3: Pyramid SR Network. We design a two-stage SR architecture to synthesize a large-scale CGH.
Figure 4: Numerical reconstruction. Full-color numerical simulations of holograms at 1080p definition generated by different methods. The second image comes from www.bigbuckbunny.org (© 2008, Blender Foundation) under the Creative Commons Attribution 3.0 license (https://creativecommons.org/licenses/by/3.0/). The third image comes from kim2013scene.
Figure 5: Numerical simulations of holograms at 4K definition generated by different methods. These images come from UHD8K.
...and 3 more figures
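The divide and merge steps described in the Figure 1 caption can be sketched with NumPy reshapes. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the input is a single-channel array, and `toy_phase_generator` is an identity placeholder standing in for the phase generator of a real CGH network. The lightweight SR network used in the phase encoder part is not modeled here.

```python
import numpy as np


def pixel_unshuffle(img, r):
    """Split an (H, W) array into r*r sub-images of shape (H//r, W//r)."""
    h, w = img.shape
    if h % r or w % r:
        raise ValueError("H and W must be divisible by r")
    # (H/r, r, W/r, r) -> (r, r, H/r, W/r) -> (r*r, H/r, W/r)
    blocks = img.reshape(h // r, r, w // r, r).transpose(1, 3, 0, 2)
    return blocks.reshape(r * r, h // r, w // r)


def pixel_shuffle(subs, r):
    """Inverse of pixel_unshuffle: merge r*r sub-images into one (H, W) array."""
    n, hs, ws = subs.shape
    if n != r * r:
        raise ValueError("expected r*r sub-images")
    # (r, r, H/r, W/r) -> (H/r, r, W/r, r) -> (H, W)
    return subs.reshape(r, r, hs, ws).transpose(2, 0, 3, 1).reshape(hs * r, ws * r)


def toy_phase_generator(sub_image):
    """Placeholder (assumption): a real network would predict a phase map here."""
    return sub_image


def divide_conquer_merge(img, r, generator=toy_phase_generator):
    """Divide into sub-images, process each independently, then merge."""
    subs = pixel_unshuffle(img, r)
    phases = np.stack([generator(s) for s in subs])
    return pixel_shuffle(phases, r)


if __name__ == "__main__":
    image = np.arange(24, dtype=float).reshape(4, 6)
    merged = divide_conquer_merge(image, r=2)
    print(merged.shape)
```

With the identity placeholder, the merged output equals the input, which is a convenient check that the unshuffle and shuffle steps are exact inverses of each other.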
