E-3DGS: Gaussian Splatting with Exposure and Motion Events

Xiaoting Yin, Hao Shi, Yuhan Bao, Zhenshan Bing, Yiyi Liao, Kailun Yang, Kaiwei Wang

TL;DR

E-3DGS targets robust 3D reconstruction under challenging lighting and motion. It uses a hardware transmittance adjustment device so that a single event camera can capture both motion events and exposure events. The authors combine motion-event supervision with grayscale frames derived from exposure events via Temporal-to-Intensity Mapping, and optimize 3D Gaussian Splatting (3DGS) in three modes: Fast, High-Quality, and Balanced Hybrid. The method reconstructs faster and at higher quality than prior event-based NeRF baselines. A further contribution is EME-3D, a real-world dataset that provides exposure events, motion events, camera calibration parameters, and sparse point clouds. Empirically, exposure events substantially improve fine-detail reconstruction on the synthetic EventNeRF dataset and on real-world EME-3D data, with lower hardware requirements than approaches that fuse event and RGB sensors.

Abstract

Achieving 3D reconstruction from images captured under optimal conditions has been extensively studied in the vision and imaging fields. However, in real-world scenarios, challenges such as motion blur and insufficient illumination often limit the performance of standard frame-based cameras in delivering high-quality images. To address these limitations, we incorporate a transmittance adjustment device at the hardware level, enabling event cameras to capture both motion and exposure events for diverse 3D reconstruction scenarios. Motion events (triggered by camera or object movement) are collected in fast-motion scenarios when the device is inactive, while exposure events (generated through controlled camera exposure) are captured during slower motion to reconstruct grayscale images for high-quality training and optimization of event-based 3D Gaussian Splatting (3DGS). Our framework supports three modes: High-Quality Reconstruction using exposure events, Fast Reconstruction relying on motion events, and Balanced Hybrid optimizing with initial exposure events followed by high-speed motion events. On the EventNeRF dataset, we demonstrate that exposure events significantly improve fine detail reconstruction compared to motion events and outperform frame-based cameras under challenging conditions such as low illumination and overexposure. Furthermore, we introduce EME-3D, a real-world 3D dataset with exposure events, motion events, camera calibration parameters, and sparse point clouds. Our method achieves faster and higher-quality reconstruction than event-based NeRF and is more cost-effective than methods combining event and RGB data. E-3DGS sets a new benchmark for event-based 3D reconstruction with robust performance in challenging conditions and lower hardware demands. The source code and dataset will be available at https://github.com/MasterHow/E-3DGS.

Paper Structure

This paper contains 17 sections, 11 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: Overview of the proposed E-3DGS framework. This framework integrates motion and exposure events for training 3DGS to effectively handle diverse real-world conditions. We utilize Temporal-to-Intensity Mapping to convert exposure events into intensity images, which yield camera trajectories and a sparse point cloud for 3DGS training. The optimization of 3DGS parameters is supervised through motion event loss and exposure event loss.
  • Figure 2: Real-world data acquisition setup: The object is placed on a motorized optical rotation stage and illuminated by an overhead ring light to ensure uniform lighting. The scene is captured using an AT-DVS, implemented with a Prophesee Evaluation Kit 4 HD (EVK4) and an aperture shutter, facilitating high dynamic range event-based grayscale imaging through temporal-to-intensity mapping by exposure events.
  • Figure 3: Qualitative comparison of appearance reconstruction on the real-world EME-3D dataset. The E2VID+NeRF method successfully reconstructs the overall scene but lacks fine local details, such as the tortoise's shell texture. EventNeRF exhibits noticeable artifacts in both the car and the tortoise. In contrast, our proposed E-3DGS, in both fast and high-quality reconstruction modes, preserves sharper and more consistent structures while maintaining cleaner backgrounds.
  • Figure 4: Qualitative comparison of geometry reconstruction on the real-world EME-3D dataset. Both E2VID+NeRF and EventNeRF, which are based on NeRF, struggle to separate the foreground from the background and are affected by noticeable noise. In contrast, the physics-based 3DGS method handles geometry reconstruction more effectively. Compared to E2VID+3DGS, our E-3DGS excels in capturing high-frequency spatial details, such as the gap between the crab’s body and the base.
  • Figure 5: Comparison of standard frame-based cameras and grayscale images generated from exposure events under extreme lighting conditions. The first column shows grayscale images derived from EventNeRF [rudnev2023eventnerf] ground truth RGB images with foreground masks extracted using EfficientSAM [xiong2023efficientsam], serving as references under favorable lighting conditions. The second column presents simulated frame-based images under low light and overexposure. The third column presents grayscale images mapped from exposure events, derived from the frame-based images using the method described in the Evaluation Dataset subsection of the Experiments section. These images exhibit improved performance in low-light conditions and demonstrate the capability to recover details lost in overexposed regions, leveraging the high dynamic range and temporal resolution of event cameras.
  • ...and 2 more figures
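
The Figure 1 caption mentions a motion event loss without defining it on this page. As a rough illustration, the sketch below uses the generic event generation model: the log-intensity change between two rendered views should match the per-pixel sum of event polarities scaled by a contrast threshold. The names `accumulate_events` and `motion_event_loss`, the default threshold of 0.2, and the squared-error form are assumptions for illustration, not the loss defined in the paper.

```python
import numpy as np

def accumulate_events(xs, ys, ps, height, width):
    """Sum event polarities per pixel over a time window."""
    acc = np.zeros((height, width))
    # Unbuffered add so several events at the same pixel all count.
    np.add.at(acc, (np.asarray(ys), np.asarray(xs)), np.asarray(ps))
    return acc

def motion_event_loss(render_t0, render_t1, event_sum, threshold=0.2, eps=1e-5):
    """Mean squared error between predicted and observed log-intensity change.

    render_t0, render_t1 : rendered intensity images at the window start and end
    event_sum            : per-pixel polarity sum from accumulate_events
    threshold            : assumed contrast threshold of the event camera
    eps                  : small offset that keeps the logarithm finite
    """
    predicted = np.log(render_t1 + eps) - np.log(render_t0 + eps)
    observed = threshold * event_sum
    return float(np.mean((predicted - observed) ** 2))
```

In a training loop the two renders would come from the 3DGS rasterizer at the camera poses bounding the event window, and the same comparison would be written in an autodiff framework so that gradients reach the Gaussian parameters.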