Depth estimation from 4D light field videos

Takahiro Kinoshita, Satoshi Ono

TL;DR

This work tackles depth estimation from 4D light field videos (LFVs) by exploiting temporal information that methods for static light fields (LFs) ignore. It introduces an end-to-end model that combines two-stream 3D CNNs for spatial-angular feature extraction with a convolutional LSTM (CLSTM) for temporal modeling, trained on a Sintel-based synthetic 4D LFV dataset. Experiments on synthetic and real LFVs show that temporal information improves depth estimation, especially in noisy regions, and that the method outperforms a baseline without temporal cues. The authors release the synthetic dataset and code to support further research.

Abstract

Depth (disparity) estimation from 4D Light Field (LF) images has been a research topic for the last couple of years. Most studies have focused on depth estimation from static 4D LF images and have not considered temporal information, i.e., LF videos. This paper proposes an end-to-end neural network architecture for depth estimation from 4D LF videos. This study also constructs a medium-scale synthetic 4D LF video dataset that can be used for training deep learning-based methods. Experimental results using synthetic and real-world 4D LF videos show that temporal information improves depth estimation accuracy in noisy regions. The dataset and code are available at: https://mediaeng-lfv.github.io/LFV_Disparity_Estimation
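
As background for the term "disparity": in a light field, a scene point shifts between neighboring views by an amount that depends on its depth, so it traces a slanted line in an epipolar-plane image (EPI), and the slope of that line is the disparity. The following toy sketch illustrates only this relationship. It is not the paper's network. The function names, the wrap-around border handling, the sign convention, and the brute-force search are all assumptions made for illustration.

```python
# Toy illustration (not the paper's method): disparity as the slope of an EPI.
# All names and conventions here are hypothetical.

def make_horizontal_epi(texture, n_views, disparity):
    """Build a horizontal EPI for one image row.

    Row v of the EPI is the 1D texture shifted by disparity * (v - center)
    pixels, wrapping around at the borders (an assumption for simplicity).
    """
    center = n_views // 2
    width = len(texture)
    epi = []
    for v in range(n_views):
        shift = disparity * (v - center)
        epi.append([texture[(x + shift) % width] for x in range(width)])
    return epi


def estimate_disparity(epi, candidates):
    """Return the candidate disparity that best aligns every view with the center view."""
    n_views = len(epi)
    width = len(epi[0])
    center = n_views // 2
    best, best_cost = None, float("inf")
    for d in candidates:
        cost = 0
        for v in range(n_views):
            shift = d * (v - center)
            for x in range(width):
                # Undo the hypothesised shift and compare with the center view.
                cost += abs(epi[v][(x - shift) % width] - epi[center][x])
        if cost < best_cost:
            best, best_cost = d, cost
    return best


texture = list(range(32))  # a simple ramp with no repeated values
epi = make_horizontal_epi(texture, n_views=5, disparity=2)
print(estimate_disparity(epi, range(-3, 4)))  # prints 2
```

A real scene has a different disparity at every pixel, along with occlusions and noise, which is why learned methods such as the one summarized here are used instead of a global search like this.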

Paper Structure

This paper contains 13 sections, 1 equation, 7 figures.

Figures (7)

  • Figure 1: 4D LFI and corresponding horizontal and vertical EPIs.
  • Figure 2: The ground-truth depth (disparity) map corresponding to the central view of a 4D LFI.
  • Figure 4: The proposed architecture.
  • Figure 5: CLSTM structure of the proposed method.
  • Figure 6: The estimation results of the proposed method and the baseline method [faluvegi20193d].
  • ...and 2 more figures