Temporal Lidar Depth Completion
Pietari Kaskela, Philipp Fischer, Timo Roman
TL;DR
The paper tackles depth completion for lidar data by fusing sparse lidar measurements with camera imagery and adding temporal recurrence to reuse information from previous frames. It builds on the PENet architecture, adapting it for recurrence at minimal extra compute. The method achieves state-of-the-art results on the KITTI depth completion dataset with less than 1% additional parameters and FLOPs. Accuracy improves most for faraway objects and regions with few lidar samples, and large improvements also appear in areas without ground truth, such as the sky and rooftops, which existing evaluation metrics do not capture. The result is better depth perception for autonomous driving in challenging scenarios without a heavy compute burden.
Abstract
Given the lidar measurements from an autonomous vehicle, we can project the points and generate a sparse depth image. Depth completion aims to increase the resolution of such a depth image by infilling and interpolating the sparse depth values. Like most existing approaches, we use camera images as guidance in very sparse or occluded regions. In addition, we propose a temporal algorithm that uses information from previous timesteps through recurrence. We show how the state-of-the-art method PENet can be modified to benefit from recurrence. Our algorithm achieves state-of-the-art results on the KITTI depth completion dataset while adding less than one percent overhead in both neural network parameters and floating-point operations. Accuracy improves especially for faraway objects and for regions containing few lidar depth samples. Even in regions without any ground truth (such as sky and rooftops), we observe large improvements that are not captured by the existing evaluation metrics.
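To make the first and last points of the abstract concrete, the sketch below projects lidar points into a sparse depth image and computes an RMSE restricted to pixels that have ground truth. It is an illustration only, not the authors' pipeline. It assumes an undistorted pinhole camera, points already expressed in the camera frame with z pointing forward, and the value 0 as the marker for "no measurement". The function names `project_to_sparse_depth` and `masked_rmse` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for this example.

```python
import math


def project_to_sparse_depth(points, fx, fy, cx, cy, width, height):
    """Project 3D points (camera frame, z forward) into a sparse depth image.

    Assumption: pixels that receive no lidar return stay at 0.0,
    which this sketch treats as "no measurement".
    """
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0.0:
            continue  # point is behind the camera
        # Pinhole projection (assumed model, no lens distortion).
        u = int(math.floor(fx * x / z + cx))
        v = int(math.floor(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            # If several points land on one pixel, keep the nearest one.
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth


def masked_rmse(pred, gt):
    """RMSE over pixels that have ground truth (gt > 0) only."""
    squared_errors = [
        (p - g) ** 2
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
        if g > 0.0
    ]
    if not squared_errors:
        return float("nan")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Five points on a tiny 4x2 image: two share a pixel, one is behind
# the camera, and one projects outside the image.
points = [
    (0.0, 0.0, 10.0),
    (1.0, 0.0, 10.0),
    (0.0, 0.0, 5.0),
    (0.0, 0.0, -1.0),
    (100.0, 0.0, 1.0),
]
sparse = project_to_sparse_depth(points, 10.0, 10.0, 2.0, 1.0, 4, 2)

# The prediction at pixels without ground truth does not affect the metric.
gt = [[1.0, 0.0], [5.0, 0.0]]
rmse_a = masked_rmse([[1.0, 2.0], [3.0, 50.0]], gt)
rmse_b = masked_rmse([[1.0, 999.0], [3.0, -7.0]], gt)
```

Here `rmse_a` and `rmse_b` are identical even though the two predictions differ widely at the unlabeled pixels. A metric of this masked form cannot register changes in regions without ground truth, which is consistent with the abstract's remark that improvements in such regions are not captured by the existing evaluation metrics.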
