BuildingView: Constructing Urban Building Exteriors Databases with Street View Imagery and Multimodal Large Language Models

Zongrong Li, Yunlei Su, Hongrong Wang, Wufan Zhao

TL;DR

This work addresses the lack of comprehensive, globally scalable databases of urban building exteriors by integrating high-resolution Street View imagery with OpenStreetMap data retrieved through the Overpass API. It introduces BuildingView, a systematic workflow that combines a literature review, building sampling, and multimodal annotation (via prompts to ChatGPT-4O) to extract and organize indicators related to energy efficiency, environmental sustainability, and human-centric design. The approach is validated on New York City, Amsterdam, and Singapore, achieving generation rates above $99.9\%$ for most indicators and strong predictive performance for select metrics (e.g., Floor-to-Floor Height $R^2=0.83$, Window-to-Wall Ratio (WWR) $R^2=0.75$), at a total annotation cost of roughly \$220 and with scalable, parallel processing. The resulting open, extensible database and accompanying workflow enable cross-city urban analytics for planning, architecture, and policy, and set the stage for global expansion and crowd-assisted data maintenance.

Abstract

Urban Building Exteriors are increasingly important in urban analytics, driven by advancements in Street View Imagery and its integration with urban research. Multimodal Large Language Models (LLMs) offer powerful tools for urban annotation, enabling deeper insights into urban environments. However, challenges remain in creating accurate and detailed urban building exterior databases, identifying critical indicators for energy efficiency, environmental sustainability, and human-centric design, and systematically organizing these indicators. To address these challenges, we propose BuildingView, a novel approach that integrates high-resolution visual data from Google Street View with spatial information from OpenStreetMap via the Overpass API. This research improves the accuracy of urban building exterior data, identifies key sustainability and design indicators, and develops a framework for their extraction and categorization. Our methodology includes a systematic literature review, building and Street View sampling, and annotation using the ChatGPT-4O API. The resulting database, validated with data from New York City, Amsterdam, and Singapore, provides a comprehensive tool for urban studies, supporting informed decision-making in urban planning, architectural design, and environmental policy. The code for BuildingView is available at https://github.com/Jasper0122/BuildingView.
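
As a rough illustration of the OpenStreetMap step described above, the sketch below builds an Overpass QL query that requests every way tagged `building` inside a bounding box, along with its tags and a centre point. It is not taken from the BuildingView repository: the function name `build_overpass_query`, its parameters, and the example coordinates are hypothetical, and the authors' actual sampling logic may differ.

```python
def build_overpass_query(south, west, north, east, timeout_s=25):
    """Return an Overpass QL query for building ways inside a bounding box.

    Hypothetical helper for illustration only; not the authors' code.
    Overpass expects bounding boxes in (south, west, north, east) order.
    """
    # This simple check does not handle boxes that cross the antimeridian.
    if not (south < north and west < east):
        raise ValueError("expected south < north and west < east")

    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout_s}];"  # JSON output, server-side timeout
        f'way["building"]({bbox});'          # all ways carrying a building tag
        "out tags center;"                   # tags plus one centre point per way
    )


# Example bounding box (illustrative coordinates, not from the paper).
query = build_overpass_query(40.7, -74.02, 40.72, -74.0)
print(query)
```

The resulting string can be sent as the `data` field of a request to an Overpass interpreter endpoint. The centre point returned for each way is one plausible anchor for choosing a nearby Street View location, although how the paper actually pairs buildings with images is not specified in this section.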

Paper Structure (21 sections, 6 figures, 2 tables)

This paper contains 21 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: BuildingView Workflow: (A) Indicator Review; (B) Sampling; (C) Annotation with ChatGPT-4O.
  • Figure 2: Literature Review of Building Exteriors
  • Figure 3: The Distribution of Sampling Points
  • Figure 4: Representative Street View Images from Selected Cities
  • Figure 5: Visualization of Results: The Number of Parking Lots (A), Trees (B), and Window-to-Wall Ratio (C)
  • ...and 1 more figure