EvGGS: A Collaborative Learning Framework for Event-based Generalizable Gaussian Splatting

Jiaxu Wang, Junhao He, Ziyi Zhang, Mingyuan Sun, Jingkai Sun, Renjing Xu

TL;DR

This work proposes the first event-based generalizable 3D reconstruction framework, called EvGGS, which reconstructs scenes as 3D Gaussians from only event input in a feedforward manner and can generalize to unseen cases without any retraining.

Abstract

Event cameras offer promising advantages such as high dynamic range and low latency, making them well-suited for challenging lighting conditions and fast-moving scenarios. However, reconstructing 3D scenes from raw event streams is difficult because event data is sparse and does not carry absolute color information. To unlock the potential of event cameras in 3D reconstruction, we propose the first event-based generalizable 3D reconstruction framework, called EvGGS, which reconstructs scenes as 3D Gaussians from event input alone in a feedforward manner and generalizes to unseen cases without any retraining. The framework comprises a depth estimation module, an intensity reconstruction module, and a Gaussian regression module. These submodules are connected in a cascade, and we train them collaboratively with a purpose-designed joint loss so that they benefit one another. To facilitate related studies, we build a novel event-based 3D dataset containing objects of various materials, with calibrated labels of grayscale images, depth maps, camera poses, and silhouettes. Experiments show that jointly trained models significantly outperform those trained individually. Our approach outperforms all baselines in reconstruction quality and in depth and intensity prediction, while maintaining a satisfactory rendering speed.
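
As a rough illustration of what a joint loss over cascaded modules can look like, the expression below writes it as a weighted sum of one term per module. The abstract does not give the actual form, so the individual terms and the weights \lambda_d, \lambda_i, \lambda_g are assumptions made purely for illustration.

```latex
% Hypothetical form of a joint objective; the terms and weights are assumptions,
% not the loss defined in the paper.
\mathcal{L}_{\text{joint}}
  = \lambda_d \, \mathcal{L}_{\text{depth}}
  + \lambda_i \, \mathcal{L}_{\text{intensity}}
  + \lambda_g \, \mathcal{L}_{\text{render}}
```

Under this reading, because the Gaussian regression module consumes the outputs of the depth and intensity modules, the gradient of the rendering term also reaches the two upstream modules, which is one way cascaded modules can benefit from being trained together.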

Paper Structure (34 sections, 16 equations, 12 figures, 8 tables)

Figures (12)

  • Figure 1: Overview of EvGGS. Given a 360-degree event stream and target viewpoints, we select two segments of event spatial-temporal voxels from consecutive moments as inputs. For each source view, we employ two submodules to extract the depth and intensity information, which serve as the 3D position and color maps. Another module infers the remaining 3D Gaussian parameters. The features and outputs of the three modules are hierarchically bridged, facilitating smooth backpropagation through joint training.
  • Figure 2: Qualitative comparison of ours and other event-based 3D methods in novel view synthesis.
  • Figure 3: Qualitative comparison of ours and other intensity reconstruction methods.
  • Figure 4: Qualitative comparison of ours and other depth estimation methods.
  • Figure 5: Qualitative comparisons on realistic event dataset.
  • ...and 7 more figures
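
The Figure 1 caption mentions "event spatial-temporal voxels" as the network input. The sketch below shows one common way to build such a voxel grid from a list of events, splitting each event's polarity between the two nearest temporal bins. It is a minimal sketch under stated assumptions: the function name, the (x, y, t, p) event layout, and the linear temporal weighting are illustrative choices, not details confirmed for EvGGS.

```python
import math


def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate events into a num_bins x height x width grid (nested lists).

    `events` is a list of (x, y, t, p) tuples: integer pixel coordinates,
    a timestamp, and a polarity (p > 0 is positive, anything else negative).
    This layout is an assumption made for the example.
    """
    grid = [[[0.0] * width for _ in range(height)] for _ in range(num_bins)]
    if not events:
        return grid

    t_min = min(e[2] for e in events)
    t_max = max(e[2] for e in events)
    span = t_max - t_min

    for x, y, t, p in events:
        polarity = 1.0 if p > 0 else -1.0
        # Map the timestamp onto the continuous bin axis [0, num_bins - 1].
        t_norm = (t - t_min) / span * (num_bins - 1) if span > 0 else 0.0
        lower = int(math.floor(t_norm))
        upper = min(lower + 1, num_bins - 1)
        # Split the polarity linearly between the two neighbouring bins.
        w_upper = t_norm - lower
        grid[lower][y][x] += polarity * (1.0 - w_upper)
        grid[upper][y][x] += polarity * w_upper
    return grid


if __name__ == "__main__":
    sample = [(0, 0, 0.0, 1), (0, 0, 0.5, 1), (1, 0, 1.0, 0)]
    for time_slice in events_to_voxel_grid(sample, num_bins=2, height=1, width=2):
        print(time_slice)
```

Nested lists keep the example dependency-free; a practical pipeline would more likely hold the grid in an array or tensor so it can be fed to a network directly.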