Enhanced Event-Based Video Reconstruction with Motion Compensation

Siying Liu; Pier Luigi Dragotti

TL;DR

This work proposes warping the input intensity frames and sparse codes with estimated flow to improve event-based video reconstruction. The resulting CISTA-Flow network achieves state-of-the-art reconstruction accuracy while also providing reliable dense flow estimation.

Abstract

Deep neural networks for event-based video reconstruction often suffer from a lack of interpretability and have high memory demands. A lightweight network called CISTA-LSTC has recently been introduced, showing that high-quality reconstruction can be achieved through the systematic design of its architecture. However, its modelling assumption that the input signals and the output reconstructed frame share the same sparse representation neglects the displacement caused by motion. To address this, we propose warping the input intensity frames and sparse codes to enhance reconstruction quality. A CISTA-Flow network is constructed by integrating a flow network with CISTA-LSTC for motion compensation. The system relies solely on events: the predicted flow aids reconstruction, and the reconstructed frames in turn facilitate flow estimation. We also introduce an iterative training framework for this combined system. Results demonstrate that our approach achieves state-of-the-art reconstruction accuracy and simultaneously provides reliable dense flow estimation. Furthermore, our model is flexible in that it can integrate different flow networks, suggesting potential for further performance enhancement.
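
To make the warping step concrete, the following is a minimal NumPy sketch of motion compensation by bilinear sampling. It is an illustration under stated assumptions, not the authors' implementation: the function name warp_bilinear, the (H, W, 2) flow layout, the border clamping, and the choice to sample the previous frame at x - flow(x) (a common approximation when only a forward flow is available) are all assumptions, since the abstract does not specify the warping operator.

```python
import numpy as np

def warp_bilinear(img, flow):
    """Warp a 2-D array by sampling it at (x - u, y - v).

    img:  (H, W) array, e.g. a previously reconstructed frame.
    flow: (H, W, 2) array of per-pixel displacements (u, v) in pixels.
    Assumption: out-of-range sample positions are clamped to the border.
    """
    h, w = img.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Source coordinates for every target pixel, clamped to the image.
    sx = np.clip(xs - flow[..., 0], 0, w - 1)
    sy = np.clip(ys - flow[..., 1], 0, h - 1)
    # Integer corners surrounding each source coordinate.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    # Fractional offsets act as interpolation weights.
    wx = sx - x0
    wy = sy - y0
    top = (1 - wx) * img[y0, x0] + wx * img[y0, x1]
    bottom = (1 - wx) * img[y1, x0] + wx * img[y1, x1]
    return (1 - wy) * top + wy * bottom
```

A multi-channel array such as a sparse-code tensor could be warped by applying the same function to each channel, assuming the codes live on the same spatial grid as the flow; if they are stored at a lower resolution, the flow would need to be resized and rescaled first.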

Paper Structure (23 sections, 9 equations, 7 figures, 3 tables)

Introduction
Methodology
Overview of CISTA-LSTC Network
Overview of CISTA-LSTC
Motion compensation for input frame and sparse codes
Events-to-Video Reconstruction with Motion Compensation
Flow estimation using reconstructed images and events
Video reconstruction using warped frames and sparse codes
Iterative Training Framework
Reconstruction loss
Flow loss
Numerical Results
Experimental Settings and Training Details
Datasets
Training details
...and 8 more sections

Figures (7)

Figure 1: Recursive CISTA-Flow architecture. (a) Original CISTA-LSTC network. (b) CISTA-Flow network. The DCEIFlow network leverages the previously reconstructed frame $\hat{\bm I}_{t-1}$ and event voxel grid ${\bm E}_{t-1}^t$ to estimate the forward flow $\hat{\bm F}_{t-1\rightarrow t}$. This estimated flow is then utilized to warp $\hat{\bm I}_{t-1}$ and the sparse codes ${\bm Z}_{t-1}^K$ obtained from the previous reconstruction. Finally, CISTA-LSTC employs these warped inputs to reconstruct the frame $\hat{\bm I}_{t}$. The initial $\hat{\bm I}_{0}$ is set to 0.
Figure 2: Reconstruction results with different warped frames. "Warp I" denotes the model with warped $\hat{\bm I}_{t-1}$ and "Warp I+Z" denotes the model with warped $\hat{\bm I}_{t-1}$ and ${\bm Z}_{t-1}^{K}$.
Figure 3: Separate training for (a) DCEIFlow using ground truth input frame and (b) CISTA-LSTC using ground truth flow, and additional training for (c) DCEIFlow (Rec I) using reconstructed frames generated by CISTA (GT Flow).
Figure 4: Comparison of video reconstruction between CISTA-Flow and other networks.
Figure 5: Reconstruction results of CISTA-LSTC and CISTA-Flow networks for scenes with multiple objects. The green boxes highlight sharper edges and finer details improved by the estimated flow, whereas the red boxes indicate areas with inaccuracies in reconstruction due to inaccurate flow.
...and 2 more figures
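
The recurrence described in the Figure 1 caption can be sketched as a short driver loop. This is a hypothetical outline only: run_recurrent, flow_net, recon_net, and warp are placeholder names for the flow network, the CISTA-LSTC reconstruction step, and the warping operator, and initialising the sparse codes to zero is an assumption (the caption only states that the initial frame is set to 0).

```python
import numpy as np

def run_recurrent(event_grids, flow_net, recon_net, warp, frame_shape, code_shape):
    """Hypothetical driver loop following the Figure 1(b) caption.

    event_grids: iterable of event voxel grids, one per time step.
    flow_net(prev_frame, events) -> flow estimate for this step.
    recon_net(events, warped_frame, warped_codes) -> (frame, codes).
    warp(array, flow) -> array warped by the flow.
    """
    prev_frame = np.zeros(frame_shape)  # initial reconstructed frame is 0
    prev_codes = np.zeros(code_shape)   # assumption: codes also start at 0
    frames, flows = [], []
    for events in event_grids:
        # 1. Estimate flow from the previous reconstruction and the events.
        flow = flow_net(prev_frame, events)
        # 2. Motion-compensate the previous frame and sparse codes.
        warped_frame = warp(prev_frame, flow)
        warped_codes = warp(prev_codes, flow)
        # 3. Reconstruct the current frame from events and warped inputs.
        prev_frame, prev_codes = recon_net(events, warped_frame, warped_codes)
        frames.append(prev_frame)
        flows.append(flow)
    return frames, flows
```

Passing the networks in as callables keeps the loop independent of any particular flow model, which matches the stated flexibility of the approach; the actual interfaces of DCEIFlow and CISTA-LSTC may differ from these placeholders.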
