Domain-Transferred Synthetic Data Generation for Improving Monocular Depth Estimation

Seungyeop Lee, Knut Peterson, Solmaz Arezoomandan, Bill Cai, Peihan Li, Lifeng Zhou, David Han

TL;DR

The paper tackles the challenge of obtaining aligned depth data for monocular depth estimation by using synthetic data from Unreal Engine simulations and bridging the sim-to-real gap with CycleGAN domain transfer. It trains a DenseDepth-based depth estimator on CycleGAN-transformed synthetic images and validates performance using Husky robot data with LiDAR ground truth, showing that GAN-transferred data can rival real-world data. The key contributions are the synthetic data generation pipeline, the CycleGAN adaptation for domain transfer, and a comprehensive cross-domain evaluation across multiple environments. The results indicate that synthetic data, when domain-transferred effectively, offers cost-efficient, high-resolution depth supervision and robust generalization for monocular depth estimation in robotics.

Abstract

A major obstacle to the development of effective monocular depth estimation algorithms is the difficulty in obtaining high-quality depth data that corresponds to collected RGB images. Collecting this data is time-consuming and costly, and even data collected by modern sensors has limited range or resolution, and is subject to inconsistencies and noise. To combat this, we propose a method of data generation in simulation using 3D synthetic environments and CycleGAN domain transfer. We compare this method of data generation to the popular NYUDepth V2 dataset by training a depth estimation model based on the DenseDepth structure using different training sets of real and simulated data. We evaluate the performance of the models on newly collected images and LiDAR depth data from a Husky robot to verify the generalizability of the approach and show that GAN-transformed data can serve as an effective alternative to real-world data, particularly in depth estimation.
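The evaluation step described in the abstract compares dense network predictions against LiDAR depth, which covers only a subset of pixels. A minimal sketch of such a masked comparison follows. The function name `sparse_depth_metrics`, the convention that pixels without a LiDAR return are stored as 0, and the choice of metrics (absolute relative error, RMSE, and the δ < 1.25 threshold accuracy commonly reported in depth estimation work) are assumptions for illustration, not details confirmed by the paper.

```python
import numpy as np


def sparse_depth_metrics(pred, lidar, min_depth=1e-3):
    """Compare a dense predicted depth map with a sparse LiDAR depth map.

    Assumption: pixels with no LiDAR return are encoded as 0 in `lidar`,
    and both arrays are in the same units (e.g. metres) and the same shape.
    """
    pred = np.asarray(pred, dtype=np.float64)
    lidar = np.asarray(lidar, dtype=np.float64)
    if pred.shape != lidar.shape:
        raise ValueError("pred and lidar must have the same shape")

    # Only pixels that actually received a LiDAR return are scored.
    valid = lidar > min_depth
    if not valid.any():
        raise ValueError("no valid LiDAR points in this frame")

    p = np.maximum(pred[valid], min_depth)  # guard against division by zero
    g = lidar[valid]
    ratio = np.maximum(p / g, g / p)

    return {
        "abs_rel": float(np.mean(np.abs(p - g) / g)),
        "rmse": float(np.sqrt(np.mean((p - g) ** 2))),
        "delta1": float(np.mean(ratio < 1.25)),
        "n_points": int(valid.sum()),
    }


# Toy usage: a 4x4 prediction scored against two LiDAR returns.
pred = np.full((4, 4), 2.0)
lidar = np.zeros((4, 4))
lidar[0, 0] = 2.0
lidar[1, 1] = 4.0
metrics = sparse_depth_metrics(pred, lidar)
```

Masking before averaging keeps the empty pixels from being counted as zero-depth ground truth, which would otherwise dominate the error.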

Paper Structure (17 sections, 11 equations, 7 figures, 1 table)


Figures (7)

  • Figure 1: Illustration of (a) CycleGAN mapping functions, (b) forward cycle-consistency, and (c) backward cycle-consistency [CycleGAN2017].
  • Figure 2: Synthetic images generated using Unreal Engine (first row) and their CycleGAN-transformed counterparts (second row). The CycleGAN transformation process alters features such as lighting and texture to increase the realism of a simulated image while maintaining the original features of the environment.
  • Figure 3: We employ Unreal Engine as a synthetic data generator and DenseDepth as a depth prediction model. The generated synthetic data are translated into images similar to real-world scenes through CycleGAN. The depth prediction model is trained with the translated data.
  • Figure 4: Examples of LiDAR and image data collected by the Husky robot. While the density of the collected LiDAR points is not enough to fully match a predicted depth image from the network, it is enough to effectively gauge the overall accuracy of depth predictions for a given image.
  • Figure 5: ClearPath Husky robot with ZED 2 camera and Velodyne VLP-32 LiDAR unit.
  • ...and 2 more figures
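
The forward and backward cycle-consistency terms named in the Figure 1 caption can be illustrated numerically. In the sketch below, `G` maps synthetic images toward the real domain and `F` maps back. Both are toy affine functions standing in for the generator networks, and the function name `cycle_consistency_loss` is hypothetical. The L1 terms are computed as mean absolute error, a common implementation choice that the source does not confirm for this paper.

```python
import numpy as np


def cycle_consistency_loss(G, F, x, y):
    """Cycle-consistency loss for two mappings G: X -> Y and F: Y -> X.

    forward cycle:  x -> G(x) -> F(G(x)) should reconstruct x
    backward cycle: y -> F(y) -> G(F(y)) should reconstruct y
    Each term is the mean absolute reconstruction error (assumed form).
    """
    forward = np.mean(np.abs(F(G(x)) - x))
    backward = np.mean(np.abs(G(F(y)) - y))
    return float(forward + backward)


# Toy stand-ins for the generators (NOT real networks): G darkens and
# offsets an image, F_inverse undoes it exactly, F_identity does not.
def G(img):
    return 0.5 * img + 0.1


def F_inverse(img):
    return (img - 0.1) / 0.5


def F_identity(img):
    return img


rng = np.random.default_rng(0)
x = rng.random((2, 8, 8, 3))  # batch of "synthetic" images in [0, 1)
y = rng.random((2, 8, 8, 3))  # batch of "real" images in [0, 1)

loss_consistent = cycle_consistency_loss(G, F_inverse, x, y)
loss_inconsistent = cycle_consistency_loss(G, F_identity, x, y)
```

When `F` inverts `G`, both cycles reconstruct their inputs and the loss is near zero. With the identity stand-in, the round trip changes the image and the loss is clearly positive. This is the property the caption of Figure 2 relies on when it says the transformation maintains the original features of the environment.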