OpenUAS: Embeddings of Cities in Japan with Anchor Data for Cross-city Analysis of Area Usage Patterns

Naoki Tamura, Kazuyuki Shoji, Shin Katayama, Kenta Urano, Takuro Yonezawa, Nobuo Kawaguchi

TL;DR

This work develops an anchoring method that establishes anchors within a shared embedding space. With these anchors, areas from different cities and periods can be embedded into the same space without sharing raw location data, which previous methods could not do.

Abstract

We publicly release OpenUAS, a dataset of area embeddings based on urban usage patterns, including embeddings for over 1.3 million 50-meter square meshes covering a total area of 3,300 square kilometers. This dataset is valuable for analyzing area functions in fields such as market analysis, urban planning, transportation infrastructure, and infection prediction. It captures the characteristics of each area in the city, such as office districts and residential areas, by employing an area embedding technique that utilizes location information typically obtained by GPS. Numerous area embedding techniques have been proposed, and while the public release of such embedding datasets is technically feasible, it has not been realized. One reason for this is that previous methods could not embed areas from different cities and periods into the same embedding space without sharing raw location data. We address this issue by developing an anchoring method that establishes anchors within a shared embedding space. We publicly release this anchor dataset along with area embedding datasets from several periods in eight major Japanese cities.
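
The anchoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy softmax model below only stands in for the actual area embedding technique, and the function name `train_with_anchors`, the usage-count matrix, the two-dimensional embedding size, and the reference values are all hypothetical. The sketch shows one plausible way to keep anchor embeddings fixed to common values while the remaining areas are learned around them.

```python
import numpy as np

def train_with_anchors(counts, anchor_idx, anchor_ref, dim=2,
                       steps=1000, lr=0.2, seed=0):
    """Toy embedding learner with pinned anchor rows (illustrative assumption).

    counts     : (n_areas, n_features) usage histogram per area, anchors included
    anchor_idx : row indices of the anchor areas
    anchor_ref : (n_anchors, dim) shared reference embeddings for the anchors
    """
    rng = np.random.default_rng(seed)
    n_areas, n_feat = counts.shape
    E = rng.normal(scale=0.1, size=(n_areas, dim))   # area embeddings
    W = rng.normal(scale=0.1, size=(dim, n_feat))    # output weights
    E[anchor_idx] = anchor_ref                       # start anchors at reference values
    target = counts / counts.sum(axis=1, keepdims=True)

    for _ in range(steps):
        logits = E @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        G = (P - target) / n_areas                   # d(mean cross-entropy)/d(logits)
        grad_E = G @ W.T
        grad_W = E.T @ G
        E -= lr * grad_E
        W -= lr * grad_W
        E[anchor_idx] = anchor_ref                   # pin anchors to the common values
    return E, W

# Hypothetical data: rows 0-2 are anchors, rows 3-4 are proprietary areas.
counts = np.array([[8, 1, 1],
                   [1, 8, 1],
                   [1, 1, 8],
                   [7, 2, 1],
                   [1, 2, 7]], dtype=float)
anchor_idx = np.array([0, 1, 2])
anchor_ref = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
E, W = train_with_anchors(counts, anchor_idx, anchor_ref)
```

Because the anchor rows are reset after every update, two parties that train separately against the same anchor data and reference values would, under these assumptions, obtain embeddings expressed relative to the same fixed points, without exchanging their own usage counts.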

Paper Structure (20 sections, 5 equations, 13 figures, 4 tables)

Figures (13)

  • Figure 1: Area2Vec architecture (adapted from Shoji et al., 2021).
  • Figure 2: Anchor data is sampled from the location dataset. Data from multiple cities and periods are embedded and clustered. Samples from each cluster are used as anchor data, and reference values are obtained as embeddings for each anchor.
  • Figure 3: Embedding with anchors. Proprietary data is mixed with anchor data to learn embeddings and fix the anchor embeddings to common values. Embeddings for proprietary data are aligned based on the anchors.
  • Figure 4: Area usage patterns of each of the 5 clusters, including all cities and periods ($D^*$).
  • Figure 5: Clustering results of UAS in each city.
  • ...and 8 more figures