Synscapes: A Photorealistic Synthetic Dataset for Street Scene Parsing

Magnus Wrenninge, Jonas Unger

TL;DR

Synscapes addresses how photorealism and controlled parameter variation in synthetic street-scene data affect training, validation, and analysis of vision systems. It introduces a procedurally generated, 25k-image dataset with photorealistic rendering, Cityscapes-compatible annotations, and rich metadata, enabling de-correlated sampling and metadata-driven experiments. Across semantic segmentation and object detection, Synscapes demonstrates stronger transferability and more informative analyses than prior synthetic datasets, aided by high realism and comprehensive annotations. The work highlights realism as a key factor in synthetic-data utility and shows how detailed metadata can reveal performance biases and guide sensor-simulation improvements, with future work aimed at refining realism metrics and expanding analysis capabilities.

Abstract

We introduce Synscapes -- a synthetic dataset for street scene parsing created using photorealistic rendering techniques, and show state-of-the-art results for training and validation as well as new types of analysis. We study the behavior of networks trained on real data when performing inference on synthetic data: a key factor in determining the equivalence of simulation environments. We also compare the behavior of networks trained on synthetic data and evaluated on real-world data. Additionally, by analyzing existing pre-trained segmentation and detection models, we illustrate how uncorrelated images along with a detailed set of annotations open up new avenues for analysis of computer vision systems, providing fine-grained information about how a model's performance changes according to factors such as distance, occlusion, and relative object orientation.

Paper Structure

This paper contains 27 sections, 1 equation, 12 figures, 8 tables.

Figures (12)

  • Figure 1: Example image from Synscapes.
  • Figure 2: The scenario for each image in Synscapes is unique, and the dataset encompasses a wide variety of street scene configurations and environmental factors.
  • Figure 3: Overview of the procedural approach for scenario and image generation. Each image is defined by a scenario created from a specific sampling of the generating parameters. The benefit of the procedural approach is that it enables full control over the parameter sampling, scenario generation and rendering without manual work.
  • Figure 4: Synscapes' broad distribution over scenario parameters allows for selection of subsets along several scenario parameters at once. Left column: overcast, right column: sunny. Top to bottom shows increasing number of cars.
  • Figure 5: Top row: input images. Bottom row: predictions made by the original authors' reference DeepLab v3+ model, trained on Cityscapes. Left to right: Synscapes, Richter (GTA) and Synthia. The validation results table shows the corresponding results.
  • ...and 7 more figures