TrueCity: Real and Simulated Urban Data for Cross-Domain 3D Scene Understanding

Duc Nguyen, Yan-Ling Lai, Qilin Zhang, Prabin Gyawali, Benedikt Schwab, Olaf Wysocki, Thomas H. Kolbe

TL;DR

TrueCity provides a first-of-its-kind benchmark with cm-accurate real-world LiDAR, a CityGML-aligned semantic city model, and synchronized synthetic LiDAR for the same urban area to enable direct quantification of the sim-to-real domain gap in 3D semantic segmentation. By evaluating a broad set of baselines across synthetic-real data mixtures, the work reveals a pronounced, class-dependent domain shift and shows that a balanced mix of synthetic and real data can enhance performance for transformer-based architectures, while locality-based methods rely more on real data. The dataset aligns with international standards to facilitate downstream integration into CityGML/OpenDRIVE workflows and reveals practical insights, such as certain large, regular classes exhibiting minimal domain gaps and several fine-grained classes requiring real data for robust segmentation. Limitations include the absence of radiometric information and dynamic objects, motivating future work to incorporate lighting/material effects and moving scene data to broaden sim-to-real gap analyses and generalizability.

Abstract

3D semantic scene understanding remains a long-standing challenge in the 3D computer vision community. A key obstacle is the scarcity of annotated real-world data for training generalizable models. The common remedy is to simulate new data. Although synthetic datasets offer scalability and perfect labels, their designer-crafted scenes fail to capture real-world complexity and sensor noise, resulting in a synthetic-to-real domain gap. Moreover, no existing benchmark provides synchronized real and simulated point clouds for segmentation-oriented domain shift analysis. We introduce TrueCity, the first urban semantic segmentation benchmark with cm-accurate annotated real-world point clouds, semantic 3D city models, and annotated simulated point clouds representing the same city. TrueCity proposes segmentation classes aligned with international 3D city modeling standards, enabling consistent evaluation of the synthetic-to-real gap. Our extensive experiments on common baselines quantify the domain shift and highlight strategies for exploiting synthetic data to enhance real-world 3D scene understanding. We are convinced that the TrueCity dataset will foster further development of sim-to-real gap quantification and enable generalizable data-driven models. The data, code, and 3D models are available online: https://tum-gis.github.io/TrueCity/
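To make "quantifying the domain shift" concrete, here is a minimal sketch of one way a class-dependent gap could be measured. It assumes per-class intersection-over-union (IoU) as the metric, which is the usual choice for semantic segmentation but is not stated in this excerpt. The function names `per_class_iou` and `domain_gap`, the three toy classes, and the label arrays are all hypothetical illustrations, not the TrueCity code or data.

```python
def per_class_iou(y_true, y_pred, num_classes):
    """Return IoU per class; None for classes absent from both arrays."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p where it does not belong
            fn[t] += 1  # missed a point of the true class t
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


def domain_gap(iou_real_trained, iou_syn_trained):
    """Per-class IoU drop when training on synthetic instead of real data."""
    return [
        None if a is None or b is None else a - b
        for a, b in zip(iou_real_trained, iou_syn_trained)
    ]


# Toy example (hypothetical labels): both models are tested on the same
# real-world points; only the training domain differs.
labels = [0, 0, 1, 1, 2, 2]
pred_real_trained = [0, 0, 1, 1, 2, 1]
pred_syn_trained = [0, 0, 1, 2, 1, 1]

ious_real = per_class_iou(labels, pred_real_trained, num_classes=3)
ious_syn = per_class_iou(labels, pred_syn_trained, num_classes=3)
gap = domain_gap(ious_real, ious_syn)
```

In this toy case class 0 shows no gap while classes 1 and 2 lose accuracy under synthetic-only training, which is the kind of class-dependent pattern the summary describes.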

Paper Structure

This paper contains 36 sections, 2 equations, 13 figures, 9 tables.

Figures (13)

  • Figure 1: TrueCity introduces real-world annotated point clouds, a semantic 3D city model, and 3D-model simulated point clouds of the same location, enabling coherent evaluation of the sim-to-real domain gap in 3D scene understanding.
  • Figure 2: Real-world point cloud (2nd row), which was manually labeled according to the class list of Table \ref{tab:class-list}, used for manual modeling of semantic 3D models (3rd row), which in turn were used to simulate and auto-label synthetic point clouds (4th row).
  • Figure 3: TrueCity represents the typical long-tail distribution challenge of real-world data.
  • Figure 4: Top-down schematic of S–R mixtures along a continuous streetscape. Solid lines mark train/validation/test splits; dashed lines mark boundaries between contiguous synthetic and real segments for each mixture ratio.
  • Figure 5: Qualitative impact of the synthetic–real (S–R) training mix on models from different methods (point-based, kernel-based, and transformer-based). We also present the ground truth synthetic and real point clouds; colors follow the TrueCity legend.
  • ...and 8 more figures