LiDAR-Guided Monocular 3D Object Detection for Long-Range Railway Monitoring
Raul David Dominguez Sanchez, Xavier Diaz Ortiz, Xingcheng Zhou, Max Peter Ronecker, Michael Karner, Daniel Watzenig, Alois Knoll
TL;DR
Problem: achieving robust long-range 3D object detection for autonomous trains using only monocular images. Approach: a LiDAR-guided training pipeline (Monocular Faraway-Frustum, MFF) with a depth estimation module, a 2.5D detection head, a frustum-based decision block, and dedicated long- and short-range 3D heads. Contributions: monocular-only railway perception with LiDAR-informed depth guidance, a four-module architecture, and detailed evaluation on OSDaR23 showing detections up to 250 m and competitive baselines. Findings/Significance: demonstrates viable long-range perception for railway automation while identifying depth accuracy and data distribution as key limiting factors, pointing to future work in real-time deployment, end-to-end training, and synthetic data augmentation.
Abstract
Railway systems, particularly in Germany, require high levels of automation to address legacy infrastructure challenges and increase train traffic safely. A key component of automation is robust long-range perception, which is essential for early detection of hazards such as obstacles at level crossings or pedestrians on tracks. Unlike automotive systems, which have braking distances of ~70 meters, trains require perception ranges exceeding 1 km. This paper presents a deep-learning-based approach for long-range 3D object detection tailored for autonomous trains. Inspired by the Faraway-Frustum approach, the method relies solely on monocular images at inference and incorporates LiDAR data only during training to improve depth estimation. The proposed pipeline consists of four key modules: (1) a modified YOLOv9 for 2.5D object detection, (2) a depth estimation network, and (3-4) dedicated short- and long-range 3D detection heads. Evaluations on the OSDaR23 dataset demonstrate the effectiveness of the approach in detecting objects at distances of up to 250 meters. The results highlight its potential for railway automation and outline areas for future improvement.
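
To make the data flow between the modules concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 2.5D detection is a 2D pixel box paired with an estimated depth, that the camera follows a standard pinhole model, and that the routing between the short- and long-range heads is a simple depth threshold. The names `Detection25D`, `Intrinsics`, `backproject_center`, and `route_to_head`, as well as the 80 m threshold, are hypothetical placeholders chosen for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Detection25D:
    """A 2D box in pixels plus an estimated depth in meters (assumed format)."""
    u_min: float
    v_min: float
    u_max: float
    v_max: float
    depth_m: float


@dataclass
class Intrinsics:
    """Pinhole camera intrinsics: focal lengths and principal point, in pixels."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject_center(det: Detection25D, cam: Intrinsics) -> tuple:
    """Lift the box center to a 3D point in the camera frame (pinhole model)."""
    u = 0.5 * (det.u_min + det.u_max)
    v = 0.5 * (det.v_min + det.v_max)
    x = (u - cam.cx) * det.depth_m / cam.fx
    y = (v - cam.cy) * det.depth_m / cam.fy
    return (x, y, det.depth_m)


def route_to_head(det: Detection25D, threshold_m: float = 80.0) -> str:
    """Pick a 3D head by estimated depth.

    The threshold is a hypothetical placeholder, not a value from the paper.
    """
    return "long_range" if det.depth_m > threshold_m else "short_range"


# Example with made-up intrinsics for a 1920x1080 image.
cam = Intrinsics(fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
far = Detection25D(950.0, 530.0, 970.0, 550.0, depth_m=200.0)
near = Detection25D(1050.0, 530.0, 1070.0, 550.0, depth_m=50.0)

far_point = backproject_center(far, cam)
near_point = backproject_center(near, cam)
far_head = route_to_head(far)
near_head = route_to_head(near)
```

In this sketch the depth estimate drives both the 3D position and the choice of head, which illustrates why depth accuracy would be a limiting factor for a pipeline of this kind: a depth error shifts the back-projected point and can send a detection to the wrong head.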
