Synthetic data augmentation for robotic mobility aids to support blind and low vision people

Hochul Hwang, Krisha Adhikari, Satya Shodhaka, Donghyun Kim

TL;DR

This study investigates whether synthetic data generated with Unreal Engine 4 can train robust vision models for robotic mobility aids that support blind and low-vision people, a safety-critical application, and shows that synthetic data can improve model performance across multiple tasks.

Abstract

Robotic mobility aids for blind and low-vision (BLV) individuals rely heavily on deep learning-based vision models specialized for various navigational tasks. However, the performance of these models is often constrained by the availability and diversity of real-world datasets, which are challenging to collect in sufficient quantities for different tasks. In this study, we investigate the effectiveness of synthetic data, generated using Unreal Engine 4, for training robust vision models for this safety-critical application. Our findings demonstrate that synthetic data can enhance model performance across multiple tasks, showcasing both its potential and its limitations when compared to real-world data. We offer valuable insights into optimizing synthetic data generation for developing robotic mobility aids. Additionally, we publicly release our generated synthetic dataset to support ongoing research in assistive technologies for BLV individuals, available at https://hchlhwang.github.io/SToP.

Paper Structure (15 sections, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Synthetic data generation pipeline. We generated synthetic data using Unreal Engine 4 and the NVIDIA Deep Learning Dataset Synthesizer for various navigational downstream tasks.
  • Figure 2: Synthetic data generation environment. (a) The Suburban environment features urban roads and sidewalks that contain a variety of objects commonly found in sidewalk settings. (b) Controllable camera trajectories allow data collection from diverse viewpoints, reflecting the perspectives of different robotic mobility aids.
  • Figure 3: Samples from SToP Dataset. (a) Comparison between real data and generated synthetic data in various lighting and viewpoint settings, highlighting the close resemblance of synthetic data to real-world conditions. (b) Visualization of ground truth bounding boxes within the UE4 environment.
  • Figure 4: Tactile paving detection results. (a) YOLOv8 successfully detects tactile pavings from a top-down view, which were not detected by the pretrained model without synthetic data training. (b) The open-vocabulary YOLO-World provides bounding boxes for tactile pavings (right), a capability that was not achieved previously (left) on a publicly available dataset [yu2019lytnetv2].
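
As an illustration of how ground truth boxes like those in Figure 3(b) might be prepared for a YOLO-style detector such as the one in Figure 4(a), the sketch below converts a pixel-space box into the normalized `class cx cy w h` label line that YOLO-format datasets commonly use. The function name `to_yolo_label`, the pixel-corner input layout, and the class index are assumptions made for this example; the text above does not describe the export format the authors used.

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-corner box to a YOLO-format label line.

    Hypothetical helper: the input layout (x_min, y_min, x_max, y_max in
    pixels) is an assumption, not the documented output of any exporter.
    """
    # Clamp the box to the image so partially visible objects stay valid.
    x_min, x_max = max(0.0, x_min), min(float(img_w), x_max)
    y_min, y_max = max(0.0, y_min), min(float(img_h), y_max)
    if x_max <= x_min or y_max <= y_min:
        raise ValueError("box has no area inside the image")

    # YOLO labels store the box center and size, each divided by the
    # image width or height so that all values fall in [0, 1].
    cx = (x_min + x_max) / 2.0 / img_w
    cy = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# Example: a tactile paving box (class 0 is an assumed index) in a 1000x800 frame.
print(to_yolo_label(0, 100, 200, 300, 400, 1000, 800))
```

Normalizing by image size keeps the labels valid if the rendered frames are later resized, which is useful when synthetic and real images have different resolutions.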