RealLiFe: Real-Time Light Field Reconstruction via Hierarchical Sparse Gradient Descent

Yijie Deng; Lei Han; Tianpeng Lin; Lin Li; Jinzhi Zhang; Lu Fang

TL;DR

RealLiFe addresses the challenge of real-time light-field reconstruction from sparse views for XR displays. It introduces Hierarchical Sparse Gradient Descent (HSGD), a coarse-to-fine optimization that sparsifies MPI gradients to focus computation on the most informative planes, coupled with an occlusion-aware refinement. The method achieves MPI generation at around 35 FPS and novel-view rendering at about 700 FPS, offering roughly 100x speedups over offline approaches while maintaining or surpassing online-method visual quality (≈2 dB PSNR improvement on several datasets). The combination of plane-sweep-based initial MPI generation, sparse-gradient refinement, and an occlusion-aware module enables robust, real-time light-field reconstruction suitable for naked-eye 3D displays and XR pipelines.

Abstract

With the rise of Extended Reality (XR) technology, there is a growing need for real-time light field generation from sparse view inputs. Existing methods can be classified into offline techniques, which can generate high-quality novel views but at the cost of long inference/training time, and online methods, which either lack generalizability or produce unsatisfactory results. However, we have observed that the intrinsic sparse manifold of Multi-plane Images (MPI) enables a significant acceleration of light field generation while maintaining rendering quality. Based on this insight, we introduce RealLiFe, a novel light field optimization method, which leverages the proposed Hierarchical Sparse Gradient Descent (HSGD) to produce high-quality light fields from sparse view images in real time. Technically, the coarse MPI of a scene is first generated using a 3D CNN, and it is further sparsely optimized by focusing only on important MPI gradients in a few iterations. Nevertheless, relying solely on optimization can lead to artifacts at occlusion boundaries. Therefore, we propose an occlusion-aware iterative refinement module that removes visual artifacts in occluded regions by iteratively filtering the input. Extensive experiments demonstrate that our method achieves comparable visual quality while being 100x faster on average than state-of-the-art offline methods and delivering better performance (about 2 dB higher in PSNR) compared to other online approaches.
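
For readers less familiar with MPIs, the rendering step can be written as standard alpha compositing with the over operator. The formula below is a sketch under stated assumptions: $D$ planes indexed from nearest ($d=1$) to farthest ($d=D$), with per-plane color $c_d$ and alpha $\alpha_d$ already warped into the target view. These symbols are illustrative and are not taken from the paper.

$$
C \;=\; \sum_{d=1}^{D} c_d \,\alpha_d \prod_{j=1}^{d-1} \left(1-\alpha_j\right)
$$

The per-plane weight $\alpha_d \prod_{j<d}(1-\alpha_j)$ is close to zero for planes that are nearly transparent or hidden behind opaque content, so a pixel is usually dominated by a few planes. This is consistent with the sparsity the abstract relies on, although the exact sparsity criterion used by HSGD is defined in the paper itself.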

Paper Structure (19 sections, 17 equations, 10 figures, 3 tables)

Introduction
Related Work
High-quality Novel View Synthesis
Generalizable Novel View Synthesis
Real-time Novel View Synthesis
Light Field Reconstruction
Method
Preliminaries
Initial MPI Generation
Hierarchical Sparse Gradient Descent
Training
Implementation Details
Experiments
Baselines, datasets and metrics
Experiment Configurations
...and 4 more sections

Figures (10)

Figure 1: Rendering quality and efficiency comparison with state-of-the-art novel view synthesis methods [Wang2021IBRNetLM, Chen2021MVSNeRFFG, Lin2021EfficientNR] and light field reconstruction methods [Solovev2022SelfimprovingMI] on Real Forward-Facing [Mildenhall2019LocalLF] at an image size of $512\times 384$. RealLiFe is our default model with 3 iterations of gradient descent, and RealLiFe-2I is one with 2 iterations of gradient descent.
Figure 2: Application of our method to support a real-time 3D display. (a) Sparse multi-view images that serve as the input to our model. (b) Our Hierarchical Sparse Gradient Descent (HSGD) is capable of generating multi-plane images (c) online at around 35 FPS. (d) Novel views can then be rendered offline from a Multi-plane Image at approximately 700 FPS. (e) Several novel views rendered from the MPI provide enough light field information to a 3D display, specifically a looking glass that supports naked-eye 3D effects from a wide range of views. (GD is short for gradient descent.)
Figure 3: Overview of RealLiFe. (a) Initial MPI Generation: First, the plane sweep volume (PSV) is constructed from multi-view images by homographic warping, and the PSV is then downsampled hierarchically to multiple resolutions. The output MPI is then generated in several iterations. Initially, the lowest-resolution PSV is fed to a 3D CNN to extract a coarse-level MPI. (b) Hierarchical Sparse Gradient Descent: The multi-resolution PSV and the upsampled MPI both go through the Hierarchical Sparse Gradient Descent module to produce a refined, higher-resolution MPI. Finally, a novel view can be easily rendered from the MPI using the repeated over operator [ThomasKPorter1984CompositingDI].
Figure 4: The sparse gradient descent module. (a) The MPI gradients $\nabla$ comprise the input plane sweep volume $P$ and the warped alpha gradients $\mathcal{A}_i^w$ for each source view $i$. (b) The volume $V$ is composed of the MPI $M$ and the MPI gradients $\nabla$. It is sparsified by selecting the top $k$ voxels along the depth axis based on alpha gradients of the reference view. Simultaneously, the output sparse indices $S$ store the positions of the selected $k$ voxels along the depth axis. (c) The sparsified volume $V_s$ is fed to a 3D CNN (learned gradient descent) to produce a refined sparse MPI residual. Finally, the sparse gradient update module utilizes the sparse indices $S$ to add the sparse MPI residual to the input multi-plane image $M$, resulting in a refined multi-plane image.
Figure 5: The influence of $k$ on the rendering quality. (a) A rendered image of RealLiFe (without using HSGD) with 40 MPI planes. (b) Top 5 MPI planes with the highest alpha gradients $\mathcal{A}$ for the red pixel. (The MPI is multiplied by $\mathcal{A}$ to better visualize its contribution to the final rendering result.) (c) The alpha gradients of the red pixel across 40 MPI planes, with the red dotted lines partitioning $d$ based on whether its alpha gradient falls within the top $k$. (d) The color ratio ($RGB_{k=3,5,7} / RGB_{k=40}$) and PSNR of the rendered red pixel in comparison to when $k=40$.
...and 5 more figures
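
The top-$k$ selection and sparse update described in the Figure 4 caption, together with the over-operator rendering mentioned in Figure 3, can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation. The function names, the `(D, H, W, C)` array layout, the back-to-front plane ordering, and ranking by gradient magnitude are all assumptions made for the example, and the learned 3D CNN is replaced by a zero residual.

```python
import numpy as np

def _pixel_grid(h, w):
    # Row/column index grids that broadcast against (k, H, W) depth indices.
    return np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

def sparsify_topk(volume, alpha_grad, k):
    """Keep, per pixel, the k depth planes with the largest alpha-gradient magnitude.

    volume:     (D, H, W, C) dense volume V (hypothetical layout).
    alpha_grad: (D, H, W) reference-view alpha gradients used for ranking.
    Returns V_s with shape (k, H, W, C) and sparse indices S with shape (k, H, W).
    """
    _, h, w = alpha_grad.shape
    order = np.argsort(np.abs(alpha_grad), axis=0)  # ascending along depth
    idx = np.sort(order[-k:], axis=0)               # top k, restored to depth order
    rows, cols = _pixel_grid(h, w)
    return volume[idx, rows, cols], idx

def sparse_update(mpi, residual, idx):
    """Add a sparse residual (k, H, W, C) to the dense MPI (D, H, W, C) at indices S."""
    rows, cols = _pixel_grid(*mpi.shape[1:3])
    refined = mpi.copy()
    # Indices are distinct per pixel, so in-place fancy-index addition is safe.
    refined[idx, rows, cols] += residual
    return refined

def composite_over(mpi):
    """Render the reference view from an RGBA MPI (D, H, W, 4), assumed back to front."""
    out = np.zeros(mpi.shape[1:3] + (3,))
    for plane in mpi:  # repeated over: each nearer plane goes over the running result
        rgb, alpha = plane[..., :3], plane[..., 3:4]
        out = rgb * alpha + out * (1.0 - alpha)
    return out

# Toy usage with random data; sizes are arbitrary.
rng = np.random.default_rng(0)
D, H, W = 40, 4, 6
mpi = rng.random((D, H, W, 4))
alpha_grad = rng.random((D, H, W))
sparse, S = sparsify_topk(mpi, alpha_grad, k=5)
residual = np.zeros_like(sparse)  # stand-in for the 3D CNN output
refined = sparse_update(mpi, residual, S)
image = composite_over(refined)
print(sparse.shape, S.shape, image.shape)
```

With $k=5$ of 40 planes, the volume passed to the refinement network is 8x smaller along the depth axis, which illustrates where the speedup described in the captions would come from under these assumptions.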
