HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures

Yichao Zhou, Jingwei Huang, Xili Dai, Shichen Liu, Linjie Luo, Zhili Chen, Yi Ma

TL;DR

HoliCity presents a city-scale, CAD-aligned 3D data platform that pairs high-resolution panoramas with accurate CAD models of downtown London to enable learning of holistic 3D structures such as corners, lines, planes, and wireframes. By leveraging CAD-grounded ground-truths, HoliCity supports both high-level geometry tasks and traditional depth/normal estimation, enabling robust urban reconstruction, localization, and AR-ready representations. The authors detail a scalable pipeline for data collection, precise panorama–CAD alignment, and an annotation framework, and demonstrate the utility of HoliCity through tasks like surface segmentation, normal estimation, vanishing-point detection, and monocular depth, highlighting its generalizability across outdoor and synthetic datasets. The work argues that a CAD-model-based outdoor dataset is crucial for reliable high-level 3D vision in cities and provides a practical, scalable resource for research and application development in urban environments.

Abstract

We present HoliCity, a city-scale 3D dataset with rich structural information. Currently, this dataset has 6,300 real-world panoramas of resolution $13312 \times 6656$ that are accurately aligned with the CAD model of downtown London with an area of more than 20 km$^2$, in which the median reprojection error of the alignment of an average image is less than half a degree. This dataset aims to be an all-in-one data platform for research of learning abstracted high-level holistic 3D structures that can be derived from city CAD models, e.g., corners, lines, wireframes, planes, and cuboids, with the ultimate goal of supporting real-world applications including city-scale reconstruction, localization, mapping, and augmented reality. The accurate alignment of the 3D CAD models and panoramas also benefits low-level 3D vision tasks such as surface normal estimation, as the surface normal extracted from previous LiDAR-based datasets is often noisy. We conduct experiments to demonstrate the applications of HoliCity, such as predicting surface segmentation, normal maps, depth maps, and vanishing points, as well as test the generalizability of methods trained on HoliCity and other related datasets. HoliCity is available at https://holicity.io.
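To put the half-degree alignment figure in pixel terms, here is a minimal sketch. It assumes the panoramas are equirectangular; the 2:1 aspect ratio is consistent with that, but the abstract does not state the projection. The longitude and latitude convention and the function names are illustrative assumptions, not the authors' evaluation code.

```python
import math
from statistics import median

PANO_WIDTH, PANO_HEIGHT = 13312, 6656  # panorama resolution stated in the abstract


def pixel_to_bearing(u, v, width=PANO_WIDTH, height=PANO_HEIGHT):
    """Map an equirectangular pixel (u, v) to a unit viewing direction.

    Assumption: longitude runs from -pi to pi left to right and latitude from
    +pi/2 to -pi/2 top to bottom; the dataset's actual convention may differ.
    """
    lon = (u / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v / height) * math.pi
    return (
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
        math.cos(lat) * math.cos(lon),
    )


def angular_error_deg(pixel_a, pixel_b):
    """Angle in degrees between the viewing rays through two pixels."""
    a = pixel_to_bearing(*pixel_a)
    b = pixel_to_bearing(*pixel_b)
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))


def median_reprojection_error_deg(annotated_pixels, reprojected_pixels):
    """Median angular error over paired annotated and reprojected points."""
    return median(
        angular_error_deg(a, b)
        for a, b in zip(annotated_pixels, reprojected_pixels)
    )


# Horizontal angular resolution, and the pixel width of half a degree.
degrees_per_pixel = 360.0 / PANO_WIDTH
pixels_per_half_degree = 0.5 / degrees_per_pixel
```

Under these assumptions one pixel spans about 0.027 degrees horizontally, so half a degree is roughly 18.5 pixels along the horizon of a full-resolution panorama.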

Paper Structure

This paper contains 36 sections, 3 equations, 12 figures, 4 tables.

Figures (12)

  • Figure 1: Our HoliCity dataset consists of accurate city-scale CAD models and spatially-registered street view panoramas. HoliCity covers an area of more than 20 km$^2$ in London from 6,300 viewpoints, which dwarfs previous datasets such as Oxford RobotCar \cite{maddern20171} (\ref{fig:teaser:satellite}). From the CAD models (\ref{fig:teaser:CAD}) and the panoramas (\ref{fig:teaser:pano}), it is possible to generate clean structured ground-truths for 3D understanding tasks, including perspective RGB images (\ref{fig:teaser:image}), surface segments, and normal maps (\ref{fig:teaser:renderings}).
  • Figure 2: Images and generated 3D information from sampled viewpoints of the HoliCity dataset. From top to bottom: perspective images rendered from panoramas, surface segments overlaid on the images, CAD model renderings, and semantic segmentation.
  • Figure 3: Statistics of HoliCity. We show the number of annotations per panorama for registration (\ref{fig:statistics:count}), the reprojection errors of annotated 3D points on panoramas (\ref{fig:statistics:reprojection}), and the occurrence of planes on different panoramas (\ref{fig:statistics:viewers}).
  • Figure 4: Qualitative results of models evaluated on HoliCity. We test MaskRCNN \cite{he2017mask}, Associative Embedding \cite{yu2019single}, PlaneRecover \cite{yang2018recovering}, and UNet \cite{ronneberger2015u} models trained on HoliCity, ScanNet \cite{dai2017scannet}, and SYNTHIA \cite{ros2016synthia}.
  • Figure 5: Qualitative results of models evaluated on images from the MegaDepth dataset \cite{zhengqi2018megadepth}. We test MaskRCNN \cite{he2017mask}, Associative Embedding \cite{yu2019single}, and UNet \cite{ronneberger2015u} models trained on HoliCity, ScanNet, and SYNTHIA. The models are not fine-tuned on MegaDepth.
  • ...and 7 more figures