Streaming CityJSON datasets
Hugo Ledoux, Gina Stavropoulou, Balázs Dukai
TL;DR
Problem: streaming very large 3D city models encoded in CityJSON is difficult because every geometry refers to a single, global list of vertices, so no city object can be processed before the whole file has been read. Approach: CityJSONSeq decomposes a dataset into per-feature CityJSONFeature objects, serialized as NDJSON after a CityJSON header line, which enables line-by-line streaming and independent processing of each feature; the cjseq tool provides the cat, collect, and filter commands for working with and converting between the two formats. Findings: CityJSONSeq typically reduces file size by about 12% (up to 28% for some datasets) and yields order-of-magnitude improvements in processing time and memory usage for local operations, as demonstrated on real datasets (e.g., Helsinki). Significance: this streaming-friendly format enables scalable workflows, easy piping in Unix-like environments, and broad tool reuse with minimal integration effort.
Abstract
We introduce CityJSON Text Sequences (CityJSONSeq for short), a format based on JSON Text Sequences and CityJSON. CityJSONSeq was added to version 2.0 of the CityJSON standard to make it possible to stream very large 3D city models. The main idea is to decompose a CityJSON dataset into its individual city objects (each building, each tree, etc.) and to create several independent JSON objects of a newly defined type: 'CityJSONFeature'. We elaborate on the engineering decisions behind CityJSONSeq, we present the open-source software we have developed to convert to and from CityJSONSeq, and we discuss different aspects of the new format, e.g., file size, usability, and memory footprint. For several use cases, we consider CityJSONSeq to be a better format than CityJSON because: (1) once serialised it is about 10% more compact; (2) it takes an order of magnitude less time to process; and (3) it uses significantly less memory.
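To make the decomposition concrete, the following is a minimal Python sketch of how a consumer might read a CityJSONSeq stream one line at a time. It assumes the layout described above: a first line holding a CityJSON object (carrying the shared `transform`, with empty `CityObjects` and `vertices`), followed by one 'CityJSONFeature' object per line, each with its own local vertex list. The function names `read_cityjsonseq` and `real_world_vertices` and the sample data are hypothetical illustrations, not part of cjseq or of any CityJSON library.

```python
import io
import json
from typing import Iterator, List, TextIO, Tuple

# Hypothetical three-line stream: one header line, then one feature per line.
SAMPLE = "\n".join([
    json.dumps({"type": "CityJSON", "version": "2.0",
                "transform": {"scale": [0.5, 0.5, 0.5],
                              "translate": [100.0, 200.0, 0.0]},
                "CityObjects": {}, "vertices": []}),
    json.dumps({"type": "CityJSONFeature", "id": "b1",
                "CityObjects": {"b1": {"type": "Building", "geometry": [
                    {"type": "MultiSurface", "lod": "1",
                     "boundaries": [[[0, 1, 2]]]}]}},
                "vertices": [[0, 0, 0], [2, 0, 0], [2, 2, 4]]}),
    json.dumps({"type": "CityJSONFeature", "id": "t1",
                "CityObjects": {"t1": {"type": "SolitaryVegetationObject",
                                       "geometry": []}},
                "vertices": []}),
]) + "\n"


def read_cityjsonseq(stream: TextIO) -> Tuple[dict, Iterator[dict]]:
    """Return the header object and a lazy iterator over the features."""
    header = json.loads(stream.readline())
    if header.get("type") != "CityJSON":
        raise ValueError("first line must be a CityJSON header object")

    def features() -> Iterator[dict]:
        # Only one feature is held in memory at a time.
        for line in stream:
            line = line.strip()
            if not line:
                continue
            feature = json.loads(line)
            if feature.get("type") != "CityJSONFeature":
                raise ValueError("expected a CityJSONFeature object")
            yield feature

    return header, features()


def real_world_vertices(header: dict, feature: dict) -> List[tuple]:
    """Apply the header's scale and translate to a feature's local vertices."""
    scale = header["transform"]["scale"]
    translate = header["transform"]["translate"]
    return [tuple(v[i] * scale[i] + translate[i] for i in range(3))
            for v in feature["vertices"]]


if __name__ == "__main__":
    header, feats = read_cityjsonseq(io.StringIO(SAMPLE))
    for feat in feats:
        print(feat["id"], len(feat["vertices"]), "vertices")
```

Because each feature carries its own vertices, the loop never needs more than the header and the current line in memory, which is the property that local operations benefit from.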
