OSDaR-AR: Enhancing Railway Perception Datasets via Multi-modal Augmented Reality

Federico Nesti, Gianluca D'Amico, Mauro Marinoni, Giorgio Buttazzo

TL;DR

This paper introduces a multi-modal augmented reality framework that integrates photorealistic virtual objects into real-world railway sequences from the OSDaR23 dataset, together with a segmentation-based refinement strategy for INS/GNSS data that significantly improves the realism of the augmented sequences.

Abstract

Although deep learning has significantly advanced the perception capabilities of intelligent transportation systems, railway applications continue to suffer from a scarcity of high-quality, annotated data for safety-critical tasks like obstacle detection. While photorealistic simulators offer a solution, they often struggle with the "sim-to-real" gap; conversely, simple image-masking techniques lack the spatio-temporal coherence required to obtain augmented single- and multi-frame scenes with the correct appearance and dimensions. This paper introduces a multi-modal augmented reality framework designed to bridge this gap by integrating photorealistic virtual objects into real-world railway sequences from the OSDaR23 dataset. Utilizing Unreal Engine 5 features, our pipeline leverages LiDAR point-clouds and INS/GNSS data to ensure accurate object placement and temporal stability across RGB frames. This paper also proposes a segmentation-based refinement strategy for INS/GNSS data to significantly improve the realism of the augmented sequences, as confirmed by the comparative study presented in the paper. Carefully designed augmented sequences are collected to produce OSDaR-AR, a public dataset designed to support the development of next-generation railway perception systems. The dataset is available at the following page: https://syndra.retis.santannapisa.it/osdarar.html
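
To illustrate why accurate localization matters for object placement, the following is a minimal sketch of projecting a world-anchored virtual object into a camera image with a pinhole model. It is not the paper's Unreal Engine 5 pipeline: the function names, the planar yaw-only pose, the frame conventions, and the intrinsics (fx, fy, cx, cy) are all assumptions chosen for illustration.

```python
import math


def world_to_camera(point_w, cam_pos_w, yaw):
    """Express a world point in a camera frame (x right, y down, z forward).

    Assumed world frame: x east, y north, z up. `yaw` is the heading in
    radians, counter-clockwise from the x axis. Roll and pitch are ignored
    in this simplified sketch.
    """
    dx = point_w[0] - cam_pos_w[0]
    dy = point_w[1] - cam_pos_w[1]
    dz = point_w[2] - cam_pos_w[2]
    # Components of the offset along the heading and to its left.
    forward = math.cos(yaw) * dx + math.sin(yaw) * dy
    left = -math.sin(yaw) * dx + math.cos(yaw) * dy
    return (-left, -dz, forward)


def project(point_c, fx, fy, cx, cy):
    """Pinhole projection; returns None for points behind the camera."""
    x, y, z = point_c
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical values: an object 10 m ahead at camera height.
obj = (10.0, 0.0, 2.0)
cam = (0.0, 0.0, 2.0)
K = dict(fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)

exact = project(world_to_camera(obj, cam, 0.0), **K)
# The same object rendered with a 0.5 degree heading error in the pose.
drifted = project(world_to_camera(obj, cam, math.radians(0.5)), **K)
pixel_shift = drifted[0] - exact[0]
```

In this toy setup a heading error shifts the object horizontally by fx * tan(error) pixels regardless of distance, which suggests why even small pose errors in raw INS/GNSS data can make an inserted object appear to drift relative to the rails across frames.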

Paper Structure (11 sections, 3 figures, 1 table)

This paper contains 11 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: Three samples from the OSDaR-AR dataset: (a) a frame from sequence 3_fire_site_3.1 (rgb_highres_center camera) augmented with a virtual animal; (b) a point-cloud from sequence 6_station_klein_flottbek_6.2 augmented with a virtual person; (c) a frame from sequence 5_station_bergedorf_5.1 (rgb_center camera) augmented with a virtual boulder. Best viewed in digital version.
  • Figure 2: Preparation and rendering pipeline to obtain OSDaR-AR sequences from the original OSDaR data. The images show, clockwise from top left to bottom left, a sample RGB image from sequence 6_station_klein_flottbek_6.2 (rgb_highres_center camera), the segmented point-cloud, the minimal reconstructed virtual environment, and the final rendered frame (cropped to better show details).
  • Figure 3: A comparison between different localization methods for sequence 6_station_klein_flottbek_6.2: the raw INS/GNSS position deviates from the rail geometry, the LiDAR odometry is quite accurate, and the segmentation-based projection refinement improves on the original INS/GNSS position.