Geometric-Photometric Event-based 3D Gaussian Ray Tracing
Kai Kohyama, Yoshimitsu Aoki, Guillermo Gallego, Shintaro Shiba
TL;DR
This work addresses the challenge of exploiting the high temporal resolution of event cameras for 3D Gaussian Splatting by decoupling rendering into two branches: event-by-event depth (geometry) and snapshot-based radiance (appearance). It introduces a differentiable, ray-traced event-based GS framework in which the two branches are connected by the image of warped events. The framework is trained with a geometric (Contrast Maximization) loss and a photometric loss, and its initialization does not rely on pretrained models or COLMAP. The method demonstrates state-of-the-art performance on real-world datasets and competitive results on synthetic data, with significantly faster training times and robustness to the number of events used. It offers a practical path toward high-temporal-resolution 3D reconstruction from sparse event data without external priors.
Abstract
Event cameras offer higher temporal resolution than traditional frame-based cameras, which makes them suitable for motion and structure estimation. However, it has been unclear how event-based 3D Gaussian Splatting (3DGS) approaches could leverage the fine-grained temporal information of sparse events. This work proposes a framework to address the trade-off between accuracy and temporal resolution in event-based 3DGS. Our key idea is to decouple the rendering into two branches, event-by-event geometry (depth) rendering and snapshot-based radiance (intensity) rendering, by using ray tracing and the image of warped events. Extensive evaluation shows that our method achieves state-of-the-art performance on real-world datasets and competitive performance on the synthetic dataset. In addition, the proposed method works without prior information (e.g., pretrained image reconstruction models) or COLMAP-based initialization, is more flexible in the number of events selected, and achieves sharp reconstruction of scene edges with fast training time. We hope that this work deepens our understanding of the sparse nature of events for 3D reconstruction. The code will be released.
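
The abstract refers to the image of warped events (IWE) and to Contrast Maximization without defining them. As a rough illustration of the general idea only, the sketch below warps events to a reference time, accumulates them into an image, and scores that image by its variance. It is not the ray-traced, depth-based pipeline of this work: the constant-flow motion model, the function names (`image_of_warped_events`, `contrast`), and the synthetic moving edge are all assumptions made for illustration.

```python
import numpy as np

def image_of_warped_events(x, y, t, flow, t_ref, shape):
    """Accumulate events into an image after warping them to time t_ref.

    Assumption: all events share one constant flow (vx, vy) in pixels per
    second. This toy motion model stands in for a real, per-event warp.
    """
    height, width = shape
    dt = t - t_ref
    # Transport each event along the flow back to the reference time.
    xw = np.rint(x - flow[0] * dt).astype(int)
    yw = np.rint(y - flow[1] * dt).astype(int)
    # Discard events that land outside the image.
    inside = (xw >= 0) & (xw < width) & (yw >= 0) & (yw < height)
    iwe = np.zeros(shape, dtype=np.float64)
    # np.add.at accumulates correctly when several events hit one pixel.
    np.add.at(iwe, (yw[inside], xw[inside]), 1.0)
    return iwe

def contrast(iwe):
    """Variance of the IWE: larger when events align into sharp edges."""
    return float(np.var(iwe))

# Synthetic events (hypothetical data): a vertical edge that starts at
# x = 10 and moves at 20 px/s along x for one second.
shape = (32, 48)
ys, ts = np.meshgrid(np.arange(32), np.linspace(0.0, 1.0, 50), indexing="ij")
y = ys.ravel().astype(float)
t = ts.ravel()
x = np.rint(10.0 + 20.0 * t)

# Warping with the true motion aligns the events; zero flow leaves a blur.
sharp = image_of_warped_events(x, y, t, (20.0, 0.0), 0.0, shape)
blurred = image_of_warped_events(x, y, t, (0.0, 0.0), 0.0, shape)
print(contrast(sharp) > contrast(blurred))
```

In this toy setting the motion hypothesis that matches the true edge velocity yields the higher contrast, which is the signal a Contrast Maximization objective exploits. In the framework described above, the warp would instead depend on the rendered geometry rather than on a single global flow.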
