Towards End-to-End Neuromorphic Event-based 3D Object Reconstruction Without Physical Priors
Chuanzhi Xu, Langyi Chen, Haodong Chen, Vera Chung, Qiang Qu
TL;DR
The paper tackles monocular neuromorphic 3D reconstruction without physical priors by proposing an end-to-end dense voxel framework. It combines a Sobel-based event representation (Sobel Event Frame) with an Efficient Channel Attention–enhanced 3D ResNet to learn edge-focused features from event streams, plus a principled, representation-dependent binarization threshold selection. On SynthEVox3D, the approach yields a 54.6% improvement in mIoU over the baseline E2V and approaches traditional multi-view results, demonstrating robust performance under rapid motion. These contributions reduce reliance on physical priors and multi-step pipelines, enabling more scalable and resilient 3D reconstruction from neuromorphic cameras in challenging environments.
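To make the idea of a Sobel-based event representation concrete, here is a minimal sketch. This section does not specify how the Sobel Event Frame is built, so the steps below are assumptions: events are accumulated into a per-pixel count image, and a standard 3x3 Sobel operator is applied to it. The `Event` tuple layout and the function names `accumulate_events` and `sobel_magnitude` are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Assumed event layout: (x, y, timestamp, polarity). Not taken from the paper.
Event = Tuple[int, int, float, int]

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def accumulate_events(events: Sequence[Event], width: int, height: int) -> List[List[float]]:
    """Count events per pixel, ignoring timestamp and polarity (a simplification)."""
    frame = [[0.0] * width for _ in range(height)]
    for x, y, _timestamp, _polarity in events:
        if 0 <= x < width and 0 <= y < height:
            frame[y][x] += 1.0
    return frame


def sobel_magnitude(frame: List[List[float]]) -> List[List[float]]:
    """Gradient magnitude from 3x3 Sobel kernels; border pixels are left at zero."""
    height, width = len(frame), len(frame[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    value = frame[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            out[y][x] = math.hypot(gx, gy)
    return out


# Usage: a vertical band of events on the right side of a 5x5 sensor.
events = [(x, y, 0.0, 1) for y in range(5) for x in (3, 4)]
edges = sobel_magnitude(accumulate_events(events, width=5, height=5))
```

The response is strongest along the boundary of the event band and zero in flat regions, which is the kind of edge emphasis the representation is meant to provide to the downstream network.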
Abstract
Neuromorphic cameras, also known as event cameras, are asynchronous brightness-change sensors that can capture extremely fast motion without suffering from motion blur, making them particularly promising for 3D reconstruction in extreme environments. However, existing research on 3D reconstruction using monocular neuromorphic cameras is limited, and most existing methods rely on estimating physical priors and employ complex multi-step pipelines. In this work, we propose an end-to-end method for dense voxel 3D reconstruction using neuromorphic cameras that eliminates the need to estimate physical priors. Our method incorporates a novel event representation to enhance edge features, enabling the proposed feature-enhancement model to learn more effectively. Additionally, we introduce the Optimal Binarization Threshold Selection Principle as a guideline for future related work, using the optimal reconstruction results achieved with threshold optimization as the benchmark. Our method achieves a 54.6% improvement in reconstruction accuracy compared to the baseline method.
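The role of the binarization threshold can be illustrated with a small sketch. It assumes the network outputs a per-voxel occupancy probability and that the threshold is chosen by sweeping candidates and keeping the one with the highest mean IoU on held-out samples. This is only one plausible reading of threshold optimization, not the paper's stated procedure. The function names and the flattened-grid representation are hypothetical.

```python
from typing import List, Sequence, Tuple


def iou(pred: Sequence[bool], target: Sequence[bool]) -> float:
    """Intersection over union of two flattened binary voxel grids."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return intersection / union if union else 1.0  # two empty grids match


def mean_iou_at(
    probs_batch: Sequence[Sequence[float]],
    targets_batch: Sequence[Sequence[bool]],
    threshold: float,
) -> float:
    """Binarize each predicted grid at `threshold` and average the IoU."""
    scores: List[float] = [
        iou([p >= threshold for p in probs], target)
        for probs, target in zip(probs_batch, targets_batch)
    ]
    return sum(scores) / len(scores)


def sweep_thresholds(
    probs_batch: Sequence[Sequence[float]],
    targets_batch: Sequence[Sequence[bool]],
    candidates: Sequence[float],
) -> Tuple[float, float]:
    """Return (best_threshold, mean_iou) over the candidate thresholds."""
    best = max(candidates, key=lambda t: mean_iou_at(probs_batch, targets_batch, t))
    return best, mean_iou_at(probs_batch, targets_batch, best)


# Usage: one flattened 4-voxel grid with made-up probabilities.
probs_batch = [[0.9, 0.6, 0.4, 0.1]]
targets_batch = [[True, True, True, False]]
best_threshold, best_miou = sweep_thresholds(probs_batch, targets_batch, [0.3, 0.5, 0.7])
```

In this toy case the lowest candidate wins because the true voxels receive probabilities as low as 0.4, which shows why a fixed threshold of 0.5 is not always the right choice.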
