Synset Signset Germany: a Synthetic Dataset for German Traffic Sign Recognition
Anne Sielemann, Lena Loercher, Max-Lion Schumacher, Stefan Wolf, Masoud Roschani, Jens Ziehn
TL;DR
The paper introduces Synset Signset Germany, a large synthetic dataset for German traffic sign recognition generated by a hybrid pipeline that combines GAN-based texture synthesis with analytical scene modulation. The dataset contains 105,500 images across 211 classes and provides per-image metadata, segmentation masks, and a subset aligned to GTSRB. In both in-domain and cross-domain evaluations, models trained on the synthetic data achieve strong performance, and the dataset's controllable parameters enable robustness and explainability analyses. The work demonstrates the value of scalable, explainable synthetic data for traffic sign recognition, and outlines limitations and avenues for extending the approach to international signs and greater realism.
Abstract
In this paper, we present a synthesis pipeline and a dataset for training and testing traffic sign recognition models. The pipeline combines the advantages of data-driven and analytical modeling: GAN-based texture generation produces data-driven dirt and wear artifacts, yielding unique and realistic traffic sign surfaces, while the analytical scene modulation achieves physically correct lighting and allows detailed parameterization. The latter in particular opens up applications in explainable AI (XAI) and robustness testing, because the sensitivity to parameter changes can be evaluated, which we demonstrate with experiments. Our resulting synthetic traffic sign recognition dataset, Synset Signset Germany, contains a total of 105,500 images of 211 different German traffic sign classes, including newly published (2020) and thus comparatively rare traffic signs. In addition to a mask and a segmentation image, we provide extensive metadata for each image, including the stochastically selected environment and imaging effect parameters. We evaluate the degree of realism of Synset Signset Germany on the real-world German Traffic Sign Recognition Benchmark (GTSRB) and in comparison to CATERED, a state-of-the-art synthetic traffic sign recognition dataset.
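The parameter-sensitivity idea mentioned in the abstract can be illustrated with a minimal sketch: group test images by one per-image metadata parameter and compare classification accuracy across the groups. This is not the authors' evaluation code. The record layout, the parameter name `sun_elevation_deg`, and the bin edges are hypothetical and chosen only for illustration; the dataset's actual metadata schema is not described in this section.

```python
from collections import defaultdict


def accuracy_by_parameter_bin(records, param, bin_edges):
    """Return classification accuracy per half-open bin [lo, hi) of `param`.

    `records` is an iterable of dicts holding the parameter value under
    `param`, the true class under "label", and the model output under
    "prediction". This layout is an assumption for illustration only.
    Values outside all bins (including the last edge itself) are skipped.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    bins = list(zip(bin_edges[:-1], bin_edges[1:]))
    for rec in records:
        value = rec[param]
        for lo, hi in bins:
            if lo <= value < hi:
                totals[(lo, hi)] += 1
                hits[(lo, hi)] += int(rec["label"] == rec["prediction"])
                break
    # Only bins that received at least one record appear in the result.
    return {b: hits[b] / totals[b] for b in totals}


# Hypothetical toy records; "sun_elevation_deg" is an invented key name.
example_records = [
    {"sun_elevation_deg": 5.0, "label": 1, "prediction": 1},
    {"sun_elevation_deg": 10.0, "label": 2, "prediction": 3},
    {"sun_elevation_deg": 40.0, "label": 1, "prediction": 1},
    {"sun_elevation_deg": 50.0, "label": 4, "prediction": 4},
]
result = accuracy_by_parameter_bin(example_records, "sun_elevation_deg", [0, 30, 60])
print(result)
```

A large accuracy gap between bins would suggest that the model is sensitive to that parameter. Because the synthetic pipeline records the parameters for every image, this kind of breakdown needs no manual annotation.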
