Streaming CityJSON datasets

Hugo Ledoux; Gina Stavropoulou; Balázs Dukai

TL;DR

Problem: streaming very large 3D city models encoded in CityJSON is challenging due to global vertex indexing. Approach: CityJSONSeq decomposes datasets into per-feature CityJSONFeature objects, serialized as NDJSON with a CityJSON header, enabling line-by-line streaming and independent feature processing; cjseq provides cat, collect, and filter commands to convert between formats. Findings: CityJSONSeq achieves typical file-size reductions of about $12\%$ (up to $28\%$ in some datasets) and yields order-of-magnitude improvements in processing time and memory usage for local operations, as demonstrated on real datasets (e.g., Helsinki). Significance: this streaming-friendly format enables scalable workflows, easy piping in Unix-like environments, and broad tool reuse with minimal integration effort.

Abstract

We introduce CityJSON Text Sequences (CityJSONSeq for short), a format based on JSON Text Sequences and CityJSON. CityJSONSeq was added to the CityJSON version 2.0 standard to allow us to stream very large 3D city models. The main idea is to decompose a CityJSON dataset into its individual city objects (each building, each tree, etc.) and create several independent JSON objects of a newly defined type: 'CityJSONFeature'. We elaborate on the engineering decisions taken to develop CityJSONSeq, present the open-source software we have developed to convert to and from CityJSONSeq, and discuss different aspects of the new format, e.g. file size, usability, and memory footprint. For several use cases, we consider CityJSONSeq to be a better format than CityJSON because: (1) once serialised it is about 10% more compact; (2) it takes an order of magnitude less time to process; and (3) it uses significantly less memory.
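To make the streaming idea concrete, here is a minimal sketch of a line-by-line reader. It assumes the layout summarised above: the first line is a CityJSON header object (carrying the "transform"), and every following line is one self-contained CityJSONFeature. The sample stream, the identifiers "b1" and "b2", and the helper names `iter_features` and `summarise` are hypothetical and only serve as illustration; this is not the authors' software.

```python
import io
import json

# Hypothetical three-line stream: one header plus two features (illustrative values).
SAMPLE = "\n".join([
    json.dumps({"type": "CityJSON", "version": "2.0",
                "transform": {"scale": [0.001, 0.001, 0.001],
                              "translate": [0.0, 0.0, 0.0]},
                "CityObjects": {}, "vertices": []}),
    json.dumps({"type": "CityJSONFeature", "id": "b1",
                "CityObjects": {"b1": {"type": "Building"}},
                "vertices": [[0, 0, 0], [1000, 0, 0], [1000, 1000, 0]]}),
    json.dumps({"type": "CityJSONFeature", "id": "b2",
                "CityObjects": {
                    "b2": {"type": "Building", "children": ["b2-balcony"]},
                    "b2-balcony": {"type": "BuildingInstallation", "parents": ["b2"]}},
                "vertices": [[0, 0, 0], [500, 0, 2500]]}),
])


def iter_features(stream):
    """Yield (header, feature) pairs; only one feature is parsed at a time."""
    header = json.loads(next(stream))
    if header.get("type") != "CityJSON":
        raise ValueError("the first line must be the CityJSON header object")
    for line in stream:
        if line.strip():  # tolerate blank lines between records
            yield header, json.loads(line)


def summarise(stream):
    """Return (feature id, number of city objects, highest z in real units)."""
    rows = []
    for header, feature in iter_features(stream):
        scale = header["transform"]["scale"]
        translate = header["transform"]["translate"]
        # Vertices are stored as integers: real z = z * scale + translate.
        zs = [v[2] * scale[2] + translate[2] for v in feature["vertices"]]
        rows.append((feature["id"], len(feature["CityObjects"]), max(zs)))
    return rows


if __name__ == "__main__":
    for row in summarise(io.StringIO(SAMPLE)):
        print(row)
```

Because each feature carries its own vertex list, a local operation such as the height summary above never needs the whole dataset in memory. The same loop works unchanged on `sys.stdin`, which is what makes piping between tools practical.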

Paper Structure (11 sections, 6 figures, 2 tables)

This paper contains 11 sections, 6 figures, 2 tables.

Introduction
Structure of a CityJSON file
Streaming (3D) datasets
CityJSON Text Sequences
Experiments with real-world datasets
Filesize comparison
Number of vertices.
Shared vertices.
Textures.
Processing speed comparison
Discussion and future work

Figures (6)

Figure 1: An example of a CityJSON file. The vertices are stored in a global list, and the positions of the vertices in that list are used to represent the boundaries of the geometries (shown by the arrows; many have been left out for clarity).
Figure 2: CityJSON mechanism to flatten out the schema: the city objects are stored in a flat list, and they are linked together with the properties "parents" and "children".
Figure 3: The CityJSONSeq of a CityJSON dataset with two buildings contains three JSON objects: one for the metadata, plus one for each building.
Figure 4: An example of a CityJSONFeature for a Building with a balcony referenced in its "children" property.
Figure 5: An example of a CityJSONSeq stream containing 3 features.
...and 1 more figure
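The captions of Figures 1 to 5 describe a conversion from one global vertex list to per-feature lists. The sketch below illustrates that re-indexing under simplifying assumptions: only one level of "children" is followed, and textures, materials, semantics, and geometry templates are ignored. The function names `remap`, `to_features`, and `to_cityjsonseq`, as well as the `DATA` sample, are hypothetical; this is not the cjseq implementation.

```python
import json

# Hypothetical dataset: two buildings whose walls share global vertices 1 and 2.
DATA = {
    "type": "CityJSON", "version": "2.0",
    "transform": {"scale": [0.001, 0.001, 0.001], "translate": [0.0, 0.0, 0.0]},
    "CityObjects": {
        "b1": {"type": "Building",
               "geometry": [{"type": "MultiSurface", "lod": "1",
                             "boundaries": [[[0, 1, 2, 3]]]}]},
        "b2": {"type": "Building",
               "geometry": [{"type": "MultiSurface", "lod": "1",
                             "boundaries": [[[1, 4, 5, 2]]]}]},
    },
    "vertices": [[0, 0, 0], [10, 0, 0], [10, 0, 10],
                 [0, 0, 10], [20, 0, 0], [20, 0, 10]],
}


def remap(boundaries, old_to_new, local_vertices, global_vertices):
    """Recursively replace global vertex indices with feature-local ones."""
    if isinstance(boundaries, list):
        return [remap(b, old_to_new, local_vertices, global_vertices)
                for b in boundaries]
    if boundaries not in old_to_new:  # first use of this global vertex
        old_to_new[boundaries] = len(local_vertices)
        local_vertices.append(global_vertices[boundaries])
    return old_to_new[boundaries]


def to_features(cityjson):
    """Yield one CityJSONFeature per top-level city object, with its children."""
    objects = cityjson["CityObjects"]
    for oid, obj in objects.items():
        if "parents" in obj:
            continue  # children are emitted together with their parent
        old_to_new, local_vertices, members = {}, [], {}
        for cid in [oid] + list(obj.get("children", [])):
            co = dict(objects[cid])  # shallow copy; the input stays untouched
            if "geometry" in co:
                co["geometry"] = [
                    {**g, "boundaries": remap(g["boundaries"], old_to_new,
                                              local_vertices, cityjson["vertices"])}
                    for g in co["geometry"]
                ]
            members[cid] = co
        yield {"type": "CityJSONFeature", "id": oid,
               "CityObjects": members, "vertices": local_vertices}


def to_cityjsonseq(cityjson):
    """Serialise as one header line followed by one line per feature."""
    header = {k: v for k, v in cityjson.items()
              if k not in ("CityObjects", "vertices")}
    header["CityObjects"] = {}
    header["vertices"] = []
    return "\n".join([json.dumps(header)]
                     + [json.dumps(f) for f in to_features(cityjson)])


if __name__ == "__main__":
    print(to_cityjsonseq(DATA))
```

In this sample the two shared vertices are written once per feature, so the stream stores eight vertices where the original stored six. That duplication is the price of making every line independent of the others.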
