A Style-Based Profiling Framework for Quantifying the Synthetic-to-Real Gap in Autonomous Driving Datasets
Dingyi Yao, Xinyao Han, Ruibo Ming, Zhihang Song, Lihui Peng, Jianming Hu, Danya Yao, Yi Zhang
TL;DR
This work tackles the synthetic-to-real gap in autonomous driving by introducing a style-based profiling framework that separates image content from style using Gram-matrix representations. The Style Embedding Distribution Discrepancy (SEDD) metric combines a shallow-layer feature extractor, Gram-based style embeddings, and metric learning with Center Loss and NTXent, producing two gap measures, $SEDD_1$ and $SEDD_2$, for robust, distribution-aware quantification. Through a public benchmark with real and synthetic datasets (e.g., KITTI, Cityscapes, VKITTI1/2) and sim-to-real methods, the authors demonstrate that SEDD can distinguish realism across datasets, outperform No-Reference IQA baselines, and track improvements from photorealism enhancements. The framework operates as a standardized quality-control tool for synthetic data, enabling targeted augmentation and refinement to better support data-driven autonomous driving systems. This approach advances data-centric evaluation by providing objective, interpretable metrics for synthetic-to-real transfer and generalization.
Abstract
Ensuring the reliability of autonomous driving perception systems requires extensive environment-based testing, yet real-world execution is often impractical. Synthetic datasets have therefore emerged as a promising alternative, offering advantages such as cost-effectiveness, bias-free labeling, and controllable scenarios. However, the domain gap between synthetic and real-world datasets remains a major obstacle to model generalization. To address this challenge from a data-centric perspective, this paper introduces a profile extraction and discovery framework for characterizing the style profiles underlying both synthetic and real image datasets. We propose Style Embedding Distribution Discrepancy (SEDD) as a novel evaluation metric. Our framework combines Gram matrix-based style extraction with metric learning optimized for intra-class compactness and inter-class separation to extract style embeddings. Furthermore, we establish a benchmark using publicly available datasets. Experiments are conducted on a variety of datasets and sim-to-real methods, and the results show that our method can quantify the synthetic-to-real gap. This work provides a standardized profiling-based quality control paradigm that enables systematic diagnosis and targeted enhancement of synthetic datasets, advancing future development of data-driven autonomous driving systems.
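To make the Gram matrix-based style extraction step concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single feature map of shape (C, H, W) from some shallow convolutional layer. The function names `gram_matrix` and `style_embedding`, the normalization by C·H·W, and the choice to flatten the upper triangle are illustrative assumptions, and the metric-learning stage (Center Loss, NTXent) is omitted.

```python
import numpy as np


def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Return the (C, C) Gram matrix of a (C, H, W) feature map.

    Entry (i, j) is the inner product of channels i and j summed over all
    spatial positions, so the spatial layout (content) is discarded and
    only channel co-activation statistics (style) remain.
    The 1 / (C * H * W) normalization is an assumption for this sketch.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)


def style_embedding(features: np.ndarray) -> np.ndarray:
    """Flatten the upper triangle of the symmetric Gram matrix to a vector.

    Hypothetical helper: the lower triangle duplicates the upper one,
    so it is dropped.
    """
    gram = gram_matrix(features)
    rows, cols = np.triu_indices(gram.shape[0])
    return gram[rows, cols]
```

Because the Gram matrix sums over spatial positions, applying the same spatial shuffle to every channel leaves it unchanged, which is the sense in which this representation separates style from content.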
