Deep Spectral Epipolar Representations for Dense Light Field Reconstruction

Noor Islam S. Mohammad

TL;DR

This work tackles dense depth estimation from 4D light field data by introducing Deep Spectral Epipolar Representation (DSER), a multi-stage framework that fuses deep spectral features with epipolar-domain regularization to enforce global structural coherence in disparity maps. It combines Least Squares Gradient (LSG), Plane Sweeping, Epipolar-Plane Image with Fine-to-Coarse Refinement (EPI-FCR), and a Directed Random Walk (DRW) refinement to produce accurate depth estimates while keeping runtimes practical. The approach uses edge confidence, radiance sampling, and adaptive cost aggregation to achieve high fidelity across textures and occlusions, with PSNRs approaching 33 dB on challenging datasets and robust boundary preservation. The results indicate that epipolar-domain priors and multiscale refinement can deliver scalable, noise-resilient dense light field depth estimation suitable for real-time or near-real-time pattern recognition and 3D scene understanding.

Abstract

Accurate and efficient dense depth reconstruction from light field imagery remains a central challenge in computer vision, underpinning applications such as augmented reality, biomedical imaging, and 3D scene reconstruction. Existing deep convolutional approaches, while effective, often incur high computational overhead and are sensitive to noise and disparity inconsistencies in real-world scenarios. This paper introduces a novel Deep Spectral Epipolar Representation (DSER) framework for dense light field reconstruction, which unifies deep spectral feature learning with epipolar-domain regularization. The proposed approach exploits frequency-domain correlations across epipolar plane images to enforce global structural coherence, thereby mitigating artifacts and enhancing depth accuracy. Unlike conventional supervised models, DSER operates efficiently with limited training data while maintaining high reconstruction fidelity. Comprehensive experiments on the 4D Light Field Benchmark and a diverse set of real-world datasets demonstrate that DSER achieves superior performance in terms of precision, structural consistency, and computational efficiency compared to state-of-the-art methods. These results highlight the potential of integrating spectral priors with epipolar geometry for scalable and noise-resilient dense light field depth estimation, establishing DSER as a promising direction for next-generation high-dimensional vision systems.
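
As a concrete illustration of the epipolar-plane geometry the abstract relies on, the following minimal sketch builds a synthetic epipolar plane image (EPI) and recovers the slope of its lines, which equals the disparity, with a generic least-squares gradient estimate. It is not the paper's LSG stage, whose details are not given here. The function names `make_epi` and `lsg_disparity`, the texture, and the 9-view, 128-pixel setup are all assumptions chosen for illustration.

```python
import numpy as np

def make_epi(num_views=9, width=128, disparity=0.6):
    """Hypothetical synthetic EPI: row s holds a 1D texture shifted by disparity * s pixels."""
    x = np.arange(width, dtype=float)
    offsets = np.arange(num_views, dtype=float) - num_views // 2

    def texture(u):
        # Smooth, low-frequency texture so finite-difference gradients stay reliable.
        return np.sin(0.15 * u) + 0.5 * np.sin(0.05 * u + 1.0)

    # A point with disparity d traces the line x = x0 + d * s through the EPI.
    return np.stack([texture(x - disparity * s) for s in offsets])

def lsg_disparity(epi):
    """Least-squares slope of the EPI lines.

    For E(s, x) = f(x - d * s) we have E_s = -d * E_x, so minimising
    sum((E_s + d * E_x)^2) gives d = -sum(E_x * E_s) / sum(E_x^2).
    """
    e_s, e_x = np.gradient(epi)  # derivatives along the view axis (0) and spatial axis (1)
    return -np.sum(e_x * e_s) / np.sum(e_x ** 2)

epi = make_epi(disparity=0.6)
estimate = lsg_disparity(epi)
print(f"estimated disparity: {estimate:.3f}")
```

On this synthetic input the estimate lands close to the true disparity of 0.6. Central differences introduce a small bias that grows with texture frequency, which is one reason a single global gradient fit would need refinement on real data.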

Paper Structure

This paper contains 26 sections, 19 equations, 8 figures, 3 tables, 3 algorithms.

Figures (8)

  • Figure 1: Block diagram of fully convolutional networks showing AE without skip connections and U-Net with skip connections. Encoder and decoder blocks include batch normalization, ReLU, and convolution layers. 'Conv(s2)' and 'ConvT(s2)' denote convolution and transposed convolution with stride 2, indicating concatenation of encoder and decoder feature maps along the channel dimension [ref32].
  • Figure 2: Depths vs. Run Times (in seconds)
  • Figure 3: Algorithmic pipeline for dense light field depth estimation. The process begins with rectified light field input, followed by disparity hypothesis generation and homographic warping to synthesize plane-specific views. A cost volume is constructed from robust photometric similarity metrics, and global optimization yields an initial disparity map and dense depth map [ref33].
  • Figure 4: Depth Map Algorithm Comparisons Using the Heidelberg Dataset: Qualitative evaluation of depth estimation performance across diverse algorithms and scene types, highlighting differences in boundary precision, occlusion handling, and depth continuity.
  • Figure 5: Per-pixel depth estimation error maps for Boxes, Dino, and Cotton scenes across LSG, Plane Sweeping, and EPI methods. Brighter intensities indicate higher depth errors. EPI variants improve geometric accuracy and edge preservation, outperforming baseline approaches in textured and textureless regions.
  • ...and 3 more figures
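
The plane-sweeping pipeline summarized in the Figure 3 caption (disparity hypotheses, warping, cost volume, disparity selection) can be sketched in one spatial dimension as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: horizontal shifts with linear interpolation stand in for homographic warping, per-pixel variance across views stands in for the robust photometric metric, a box filter stands in for cost aggregation, and a winner-take-all argmin replaces global optimization. The names `make_views` and `plane_sweep` and all parameter values are hypothetical.

```python
import numpy as np

def make_views(num_views=9, width=128, disparity=0.6):
    """Hypothetical 1D light field: each view is a smooth texture shifted by disparity * s."""
    x = np.arange(width, dtype=float)
    offsets = np.arange(num_views, dtype=float) - num_views // 2
    views = np.stack([
        np.sin(0.15 * (x - disparity * s)) + 0.5 * np.sin(0.05 * (x - disparity * s) + 1.0)
        for s in offsets
    ])
    return views, offsets

def plane_sweep(views, offsets, hypotheses, window=9):
    """Build a cost volume over disparity hypotheses and pick the per-pixel minimum."""
    width = views.shape[1]
    x = np.arange(width, dtype=float)
    kernel = np.ones(window) / window  # box filter used as simple cost aggregation
    cost = np.empty((len(hypotheses), width))
    for i, d in enumerate(hypotheses):
        # Warp every view to the center view under hypothesis d.
        # np.interp clamps at the borders, so edge pixels are unreliable.
        warped = np.stack([np.interp(x + d * s, x, v) for v, s in zip(views, offsets)])
        # Views agree (low variance) when the hypothesis matches the true disparity.
        cost[i] = np.convolve(warped.var(axis=0), kernel, mode="same")
    return hypotheses[np.argmin(cost, axis=0)], cost

views, offsets = make_views(disparity=0.6)
hypotheses = np.linspace(0.0, 1.2, 13)
disparity_map, cost_volume = plane_sweep(views, offsets, hypotheses)
interior = disparity_map[16:-16]  # skip border pixels affected by clamping
print("median interior disparity:", np.median(interior))
```

The winner-take-all step is the weakest part of this sketch: it resolves disparity only to the hypothesis spacing and has no smoothness term, which is where a global optimization stage such as the one the caption mentions would take over.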