Transport-Related Surface Detection with Machine Learning: Analyzing Temporal Trends in Madrid and Vienna

Miguel Ureña Pliego, Rubén Martínez Marín, Nianfang Shi, Takeru Shibayama, Ulrich Leth, Miguel Marchamalo Sacristán

TL;DR

The paper addresses the need for scalable, historical data on transport-related urban surfaces by introducing an open-data workflow that automatically generates semantic segmentation datasets from WMS/WMTS imagery, vector cartography, and OpenStreetMap (OSM). It uses a transformer-based encoder (SAM) with a trainable decoder to detect car and pedestrian surfaces in Madrid and Vienna, and applies the trained models to historical imagery (e.g., 2001–2023 in Madrid; 2014–2023 in Vienna) to reveal how each city has evolved. The study reports an intersection over union ($IoU$) of about $0.73$ for buildings, with other classes in the mid-range, and shows cross-city generalization via parking-model transfer and SAM-based fine-tuning, suggesting a practical, low-cost tool for municipal analytics. Although ground-truth quality and gaps in OSM data remain challenges, the approach enables rapid, region-adaptive surface inventories and temporal analyses that can inform urban mobility and policy decisions.
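For readers unfamiliar with the metric, the $IoU$ of a class is the number of pixels where prediction and ground truth both assign that class, divided by the number of pixels where either one assigns it. A minimal sketch in plain Python follows; the function name `per_class_iou` and the toy label grids are illustrative assumptions, not the authors' evaluation code.

```python
def per_class_iou(pred, truth, num_classes):
    """Return one IoU value per class for two equally sized label grids.

    pred and truth are lists of rows holding integer class ids.
    A class that appears in neither grid gets None instead of a score.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == t:
                # Both grids agree: the pixel counts once in both sets.
                intersection[p] += 1
                union[p] += 1
            else:
                # Disagreement: the pixel enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    return [i / u if u else None for i, u in zip(intersection, union)]


# Toy example with two hypothetical classes (0 = background, 1 = building).
predicted = [[0, 0],
             [1, 1]]
reference = [[0, 1],
             [1, 1]]
scores = per_class_iou(predicted, reference, num_classes=2)
# Class 0: 1 shared pixel out of 2; class 1: 2 shared pixels out of 3.
```

Scores are reported per class because a single averaged value can hide a weak class behind a strong one such as buildings.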

Abstract

This study explores the integration of machine learning into urban aerial image analysis, with a focus on identifying infrastructure surfaces for cars and pedestrians and analyzing historical trends. It emphasizes the transition from convolutional architectures to transformer-based pre-trained models, underscoring their potential in global geospatial analysis. A workflow is presented for automatically generating semantic segmentation datasets from various sources, including WMS/WMTS links, vectorial cartography, and OpenStreetMap (OSM) overpass-turbo requests. The developed code enables fast dataset generation for training machine learning models from openly available data, without manual labelling. Using aerial imagery and vectorial data from the respective geographical offices of Madrid and Vienna, two datasets were generated for car and pedestrian surface detection. A transformer-based model was trained and evaluated for each city, demonstrating good accuracy values. For the historical trend analysis, the trained model was applied to earlier images that predate the availability of vectorial data by 10 to 20 years, and it successfully identified temporal trends in infrastructure for pedestrians and cars across different city areas. Municipal governments can use this technique to gather valuable data at minimal cost.
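The step that removes the need for manual labelling is burning vector polygons into a pixel mask aligned with each image tile. The sketch below illustrates that idea in plain Python with an even-odd point-in-polygon test. The function names, the class ids, and the georeferencing parameters (`origin_x`, `origin_y`, `pixel_size`) are hypothetical, and the authors' code may well rely on a GIS library instead.

```python
def point_in_polygon(x, y, ring):
    """Even-odd rule: count edge crossings of a ray cast towards +x."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygons, width, height, origin_x, origin_y, pixel_size):
    """Burn (class_id, ring) polygons into a height x width label mask.

    (origin_x, origin_y) is the top-left corner of the tile in map units.
    Class 0 is background; later polygons overwrite earlier ones.
    """
    mask = [[0] * width for _ in range(height)]
    for class_id, ring in polygons:
        for row in range(height):
            # Map y decreases as the row index grows (north-up imagery).
            cy = origin_y - (row + 0.5) * pixel_size
            for col in range(width):
                cx = origin_x + (col + 0.5) * pixel_size
                if point_in_polygon(cx, cy, ring):
                    mask[row][col] = class_id
    return mask


# Hypothetical 4 x 4 tile with one square polygon of class 1.
square = (1, [(1, 1), (3, 1), (3, 3), (1, 3)])
tile_mask = rasterize([square], width=4, height=4,
                      origin_x=0, origin_y=4, pixel_size=1)
# tile_mask == [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
```

Testing each pixel centre against each polygon is slow for real tiles, but it makes the alignment between map coordinates and mask indices explicit, which is where label errors tend to originate.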

Paper Structure

This paper contains 23 sections, 14 figures, and 3 tables.

Figures (14)

  • Figure 1: The SAM model can handle only general, common tasks.
  • Figure 2: Transformer-based model (Dosovitskiy et al., 2020).
  • Figure 3: Dataset grid example.
  • Figure 4: The importance of image resolution.
  • Figure 5: Remaining distortion in orthoimages of a building in Madrid (Ayuntamiento de Madrid, Geoportal, n.d.) taken by UAV (least distortion), plane, and satellite (most distortion).
  • ...and 9 more figures