WinSyn: A High Resolution Testbed for Synthetic Data

Tom Kelly, John Femiani, Peter Wonka

TL;DR

WinSyn addresses the synthetic-to-real domain gap by pairing a high-resolution dataset of real-world window photographs with a procedural modeling testbed. Semantic segmentation models are trained on synthetic data and evaluated on real-world windows to quantify the gap: synthetic training reaches a global mIoU of about 31.2, against an upper bound of approximately 53.8 from training on real data. Despite a detailed, high-variance procedural model and large synthetic datasets, the gap remains substantial and is not closed by increasing dataset size or by modest architectural changes. The work provides a practical, scalable platform for rapid iteration in procedural graphics and synthetic data generation, with potential applications in depth estimation, material property recovery, and lighting-aware reconstruction.

Abstract

We present WinSyn, a unique dataset and testbed for creating high-quality synthetic data with procedural modeling techniques. The dataset contains high-resolution photographs of windows, selected from locations around the world, with 89,318 individual window crops showcasing diverse geometric and material characteristics. We evaluate a procedural model by training semantic segmentation networks on both synthetic and real images and then comparing their performance on a shared test set of real images. Specifically, we measure the difference in mean Intersection over Union (mIoU) and determine the effective number of real images needed to match the performance obtained by training on synthetic data. We design a baseline procedural model as a benchmark and provide 21,290 synthetically generated images. By tuning the procedural model, we identify key factors that significantly influence the model's fidelity in replicating real-world scenarios. Importantly, we highlight the challenge of procedural modeling using current techniques, especially in their ability to replicate the spatial semantics of real-world scenarios. This insight is critical because of the potential of procedural models to bridge to hidden scene aspects such as depth, reflectivity, material properties, and lighting conditions.
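
The evaluation described above rests on the mean Intersection over Union between predicted and ground-truth label maps. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function names `mean_iou` and `domain_gap` are hypothetical, and averaging only over classes that occur in the prediction or the ground truth is an assumption; the paper's exact averaging convention may differ.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    truth: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean IoU over the classes that occur in pred or truth (assumed convention)."""
    inter = [0] * num_classes  # pixels where both maps agree on class c
    union = [0] * num_classes  # pixels where either map says class c
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == t:
                inter[p] += 1
                union[p] += 1
            else:
                # A mismatched pixel belongs to the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


def domain_gap(real_miou: float, synth_miou: float) -> float:
    """Difference in mIoU between a real-trained and a synthetic-trained model."""
    return real_miou - synth_miou


if __name__ == "__main__":
    # Toy 2x2 label maps with two classes, for illustration only.
    truth = [[0, 0], [1, 1]]
    pred = [[0, 1], [1, 1]]
    print(mean_iou(pred, truth, num_classes=2))
    # The mIoU values quoted in the TL;DR give a gap of about 22.6 points.
    print(domain_gap(53.8, 31.2))
```

Both models must be scored on the same real test images for the difference to be meaningful, which is why the abstract specifies a shared test set.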

Paper Structure

This paper contains 10 sections, 7 figures, and 2 tables.

Figures (7)

  • Figure 1: Photographers in 28 geographic regions captured real-world photos (a) of windows, which are cropped (b) to single windows and then labeled (c). Synthetic windows are rendered to give color (f) and labels (d), while other passes such as depth (e) are also possible.
  • Figure 2: Samples from the 75,739 photographs in the dataset. Each column shows a variety of examples of windows from different geographic locations. From left to right: Chicago (USA), Cambridge (UK), Bangkok (Thailand), Cairo (Egypt), and Vienna (Austria). The dataset has a variety of window shapes and architectural styles.
  • Figure 3: Examples of the labels used to annotate our data. Each instance receives its own polygon. The reader may wish to zoom into the figure for details.
  • Figure 4: Left: histogram of per-image mIoUs showing the distribution of labeling results for models trained on $n = 2{,}048$ synthetic (s) and real (r) images. We also show the per-image difference between the two models' mIoUs (r-s). The mIoU was evaluated on 4,906 real images. Right: random samples of the labeling quality for networks trained on real and synthetic data; the first sample with an mIoU above each decile was selected.
  • Figure 5: The effect of varying real-world dataset sizes on mIoU with (blue) and without (red) an additional 2,048 synthetic samples. The green area bounds the range in which adding real data neither harms nor improves performance; its right-most point has $n=152$ with an mIoU of $44.96$. For larger datasets, synthetic data slightly reduces mIoU relative to using only real-world data.
  • ...and 2 more figures
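
The per-image comparison in the left panel of Figure 4 can be sketched as follows. This is an illustrative sketch only: the helper names `per_image_difference` and `histogram` are hypothetical, the bin range and bin count are assumptions, and the scores in the usage example are toy values, not results from the paper.

```python
from typing import List


def per_image_difference(
    real_scores: List[float], synth_scores: List[float]
) -> List[float]:
    """Per-image difference r - s for two models scored on the same images."""
    if len(real_scores) != len(synth_scores):
        raise ValueError("both models must be scored on the same images")
    return [r - s for r, s in zip(real_scores, synth_scores)]


def histogram(values: List[float], lo: float, hi: float, bins: int) -> List[int]:
    """Count values in `bins` equal-width bins over [lo, hi]; the last bin includes hi."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if v < lo or v > hi:
            continue  # ignore values outside the plotted range
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


if __name__ == "__main__":
    # Toy per-image mIoUs (fractions in [0, 1]), for illustration only.
    real = [0.75, 0.5, 0.25]
    synth = [0.25, 0.5, 0.5]
    diffs = per_image_difference(real, synth)
    # Differences lie in [-1, 1]; positive values favor the real-trained model.
    print(histogram(diffs, lo=-1.0, hi=1.0, bins=4))
```

A histogram of r-s shows not only the average gap but also how many test images the synthetic-trained model labels as well as, or better than, the real-trained one.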