Solar PV Installation Potential Assessment on Building Facades Based on Vision and Language Foundation Models

Ruyu Liu, Dongxu Zhuang, Jianhua Zhang, Arega Getaneh Abate, Per Sieverts Nielsen, Ben Wang, Xiufeng Liu

TL;DR

SF-SPA introduces a four-stage pipeline to quantify building-façade PV potential from a single street-view image, leveraging semantic rectification, zero-shot façade parsing, and LLM-guided spatial reasoning to output installable PV layouts and energy yields. The approach combines geometry, vision-language models, and energy simulation (pvlib) to produce metrically valid PV layouts without 3D data or domain-specific training, validated on 80 buildings across four countries with an average area error of 6.2% (±2.8%) and ~100 s per building. Key contributions include: (i) semantics-guided geometric rectification using semantic keypoints, (ii) zero-shot façade parsing with vision-language models, (iii) a structured prompt chain for LLM-based PV layout reasoning, and (iv) pvlib-based irradiance and energy simulations incorporating weather data and module parameters. The framework demonstrates practical utility for urban energy planning and BIPV deployment, enabling scalable, city-wide façade PV screening, while acknowledging limitations related to 2D data, rectification failures, and LLM latency, with future work targeting 3D data integration and automated scaling.

Abstract

Building facades represent a significant untapped resource for solar energy generation in dense urban environments, yet assessing their photovoltaic (PV) potential remains challenging due to complex geometries and semantic components. This study introduces SF-SPA (Semantic Facade Solar-PV Assessment), an automated framework that transforms street-view photographs into quantitative PV deployment assessments. The approach combines computer vision and artificial intelligence techniques to address three key challenges: perspective distortion correction, semantic understanding of facade elements, and spatial reasoning for PV layout optimization. Our four-stage pipeline processes images through geometric rectification, zero-shot semantic segmentation, Large Language Model (LLM)-guided spatial reasoning, and energy simulation. Validation across 80 buildings in four countries demonstrates robust performance with mean area estimation errors of 6.2% ± 2.8% compared to expert annotations. The automated assessment requires approximately 100 seconds per building, a substantial gain in efficiency over manual methods. Simulated energy yield predictions confirm the method's reliability and applicability for regional potential studies, urban energy planning, and building-integrated photovoltaic (BIPV) deployment. Code is available at: https://github.com/CodeAXu/Solar-PV-Installation
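
The abstract reports accuracy as a mean area estimation error with a spread (6.2% ± 2.8%) relative to expert annotations. The exact metric definition is not given here, so the following is only a minimal sketch under an assumption: the error for each building is taken as the absolute relative difference between predicted and expert-annotated installable area, and the spread is the population standard deviation. The function name and the sample numbers are hypothetical and are not taken from the paper or its repository.

```python
from statistics import mean, pstdev


def area_error_stats(predicted_m2, reference_m2):
    """Return (mean, std) of per-building relative area error, in percent.

    Assumed metric: |predicted - reference| / reference * 100 per building.
    """
    if len(predicted_m2) != len(reference_m2) or not predicted_m2:
        raise ValueError("need two non-empty lists of equal length")
    errors = [
        abs(p - r) / r * 100.0
        for p, r in zip(predicted_m2, reference_m2)
    ]
    # pstdev treats the buildings as the whole population, not a sample.
    return mean(errors), pstdev(errors)


# Hypothetical installable areas (m^2) for three facades.
pred = [95.0, 210.0, 52.0]
ref = [100.0, 200.0, 50.0]
mean_err, std_err = area_error_stats(pred, ref)
print(f"{mean_err:.1f}% ± {std_err:.1f}%")
```

If the authors use a sample standard deviation instead, `pstdev` would be swapped for `statistics.stdev`; the source text does not say which one applies.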

Paper Structure

This paper contains 28 sections, 11 equations, 7 figures, 11 tables.

Figures (7)

  • Figure 1: The Semantic Façade Solar-PV Assessment (SF-SPA) pipeline. The process consists of four primary stages from image acquisition to energy simulation.
  • Figure 2: Predicted PV installation patches (colored rectangles) overlaid on representative façades from the dataset: (a) Hangzhou, China; (b) Tianjin, China; (c) Ankara, Turkey; (d) Ålesund, Norway. This illustrates the method's adaptability to diverse architectural styles.
  • Figure 3: End-to-end workflow demonstration for facade PV potential assessment on a street in Ålesund, Norway (62.47°N, 6.15°E), showing (from left to right) original full view, rectified facade segment, semantic segmentation, LLM-suggested PV installation, and illustrative monthly/daily PV potential graphs.
  • Figure 4: Comparative analysis workflow illustration for facade PV potential assessment on a street in Tianjin, China (39.08°N, 117.20°E), showcasing similar stages as Figure 3 and highlighting adaptability to different urban forms and climatic conditions.
  • Figure 5: Impact of Geometric Rectification on PV-Layout Generation Quality. Left: Processing an unrectified image leads to skewed masks and invalid PV layouts. Right: The rectified image yields clean semantic segmentation and accurate, installable PV layouts.
  • ...and 2 more figures