From Gaming to Research: GTA V for Synthetic Data Generation for Robotics and Navigations

Matteo Scucchia, Matteo Ferrara, Davide Maltoni

TL;DR

This work addresses the data scarcity and cost barriers in robotics vision by introducing GTA V as a source of synthetic RGB-D data for SLAM and Visual Place Recognition (VPR). It presents an end-to-end pipeline to capture GTA V data and a VPR dataset-generation algorithm that operates without human supervision. Through VPR and SLAM experiments, the authors demonstrate that GTA V data can substitute for real data in many scenarios and can boost performance when combined with real data, achieving competitive recall in VPR and accurate RGB-D trajectories in SLAM. The approach enables scalable, low-cost creation of large synthetic datasets, paving the way for lifelong SLAM research and broad benchmarking in robotics vision.

Abstract

In computer vision, developing robust algorithms that generalize effectively to real-world scenarios increasingly requires large-scale datasets collected under diverse environmental conditions. However, acquiring such datasets is time-consuming, costly, and sometimes infeasible. To address these limitations, synthetic data has gained attention as a viable alternative, allowing researchers to generate vast amounts of data while simulating various environmental contexts in a controlled setting. In this study, we investigate the use of synthetic data in robotics and navigation, focusing specifically on Simultaneous Localization and Mapping (SLAM) and Visual Place Recognition (VPR). In particular, we introduce a synthetic dataset created using the virtual environment of the video game Grand Theft Auto V (GTA V), along with an algorithm designed to generate a VPR dataset without human supervision. Through a series of experiments centered on SLAM and VPR, we demonstrate that synthetic data derived from GTA V are qualitatively comparable to real-world data. Furthermore, these synthetic data can complement or even substitute for real-world data in these applications. This study sets the stage for the creation of large-scale synthetic datasets, offering a cost-effective and scalable solution for future research and development.

Paper Structure

This paper contains 15 sections, 7 equations, 12 figures, 5 tables, 2 algorithms.

Figures (12)

  • Figure 1: On the left, (a) shows a sparse depth map, while on the right, (b) is the corresponding dense depth.
  • Figure 2: Three-quarter view of a point cloud acquired with a 3D laser scanner, from the KITTI dataset [kitti]. Color ranges from blue for points close to the sensor to red for points far from it.
  • Figure 3: Image with bounding boxes from the PreSIL dataset [presil], acquired from GTA V. The blue rectangles are the bounding boxes for vehicles, while the red ones are for pedestrians.
  • Figure 4: On the left, (a) shows an RGB image, while on the right, (b) shows the corresponding segmented image. The images are from the GTA5 dataset [gta_seg], acquired from GTA V. Each pixel is colored according to the category of the object to which it belongs.
  • Figure 5: Data acquisition process.
  • ...and 7 more figures