Solar Photovoltaic Assessment with Large Language Model

Muhao Guo, Yang Weng

TL;DR

The paper tackles the challenge of detecting, localizing, and counting rooftop solar PV in satellite imagery in a transparent and scalable way. It introduces PVAL, a framework that combines data engineering, task-decomposed prompting, schema-guided JSON outputs, few-shot prompting, and targeted fine-tuning of a multimodal LLM to deliver structured predictions with likelihood and confidence metrics. Empirical results show that prompted PVAL is competitive, while fine-tuned PVAL achieves superior accuracy (e.g., solar F1-scores around 82–83% with high recall) across six U.S. regions and maintains performance in cross-continental tests. The approach reduces manual labeling through a confidence-driven auto-labeling mechanism, which supports scalable PV inventories and PV-aware grid operations such as state estimation and load forecasting in active distribution networks (ADNs) and microgrids. Overall, PVAL demonstrates that constraining multimodal LLMs with a spatial schema and uncertainty signals can yield a cost-effective, generalizable tool for large-scale renewable energy integration and grid resilience.

Abstract

Accurate detection and localization of solar photovoltaic (PV) panels in satellite imagery is essential for optimizing microgrids and active distribution networks (ADNs), which are critical components of renewable energy systems. Existing methods lack transparency regarding their underlying algorithms or training datasets, rely on large, high-quality PV training data, and struggle to generalize to new geographic regions or varied environmental conditions without extensive re-training. These limitations lead to inconsistent detection outcomes, hindering large-scale deployment and data-driven grid optimization. In this paper, we investigate how large language models (LLMs) can be leveraged to overcome these challenges. Despite their promise, LLMs face several challenges in solar panel detection, including difficulties with multi-step logical processes, inconsistent output formatting, frequent misclassification of visually similar objects (e.g., shadows, parking lots), and low accuracy in complex tasks such as spatial localization and quantification. To overcome these issues, we propose the PV Assessment with LLMs (PVAL) framework, which incorporates task decomposition for more efficient workflows, output standardization for consistent and scalable formatting, few-shot prompting to enhance classification accuracy, and fine-tuning using curated PV datasets with detailed annotations. PVAL ensures transparency, scalability, and adaptability across heterogeneous datasets while minimizing computational overhead. By combining open-source accessibility with robust methodologies, PVAL establishes an automated and reproducible pipeline for solar panel detection, paving the way for large-scale renewable energy integration and optimized grid management.
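
To make the idea of output standardization concrete, the following is a minimal sketch of how a schema-constrained model response could be validated before it is used downstream. It is not the authors' implementation. The location labels are taken from the Figure 2 caption; the field names (`has_solar`, `location`, `count`, `likelihood`, `confidence`), the [0, 1] ranges, and the consistency rule between presence, location, and count are assumptions made for illustration.

```python
import json
from dataclasses import dataclass

# Location labels as listed in the Figure 2 caption ("NA" means no solar panels).
LOCATIONS = {
    "Top", "Bottom", "Left", "Right", "Center",
    "Top-left", "Top-right", "Bottom-left", "Bottom-right", "NA",
}

# Hypothetical field names; the paper's actual JSON schema is not shown here.
REQUIRED = {"has_solar", "location", "count", "likelihood", "confidence"}


@dataclass(frozen=True)
class TilePrediction:
    has_solar: bool
    location: str
    count: int
    likelihood: float
    confidence: float


def _is_unit_interval(value) -> bool:
    """True for a real number in [0, 1]; booleans are rejected explicitly."""
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and 0.0 <= value <= 1.0
    )


def parse_prediction(raw: str) -> TilePrediction:
    """Parse one model response and raise ValueError if it breaks the schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    missing = REQUIRED - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(data["has_solar"], bool):
        raise ValueError("has_solar must be a boolean")
    if data["location"] not in LOCATIONS:
        raise ValueError(f"unknown location: {data['location']!r}")
    count = data["count"]
    if not isinstance(count, int) or isinstance(count, bool) or count < 0:
        raise ValueError("count must be a non-negative integer")
    for key in ("likelihood", "confidence"):
        if not _is_unit_interval(data[key]):
            raise ValueError(f"{key} must be a number in [0, 1]")
    # Assumed consistency rule: no panels <=> location "NA" and count 0.
    if data["has_solar"]:
        if data["location"] == "NA" or count == 0:
            raise ValueError("solar tile needs a location and a positive count")
    elif data["location"] != "NA" or count != 0:
        raise ValueError("non-solar tile must use location 'NA' and count 0")
    return TilePrediction(
        has_solar=data["has_solar"],
        location=data["location"],
        count=count,
        likelihood=float(data["likelihood"]),
        confidence=float(data["confidence"]),
    )
```

Rejecting a malformed response outright, rather than repairing it, keeps the pipeline's behavior easy to audit; a caller could instead re-prompt the model when `ValueError` is raised.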

Paper Structure

This paper contains 33 sections, 8 equations, 12 figures, 5 tables, 1 algorithm.

Figures (12)

  • Figure 1: Workflow of PVAL: solar panel detection, localization, and counting using LLMs. The framework combines data engineering, prompt engineering, and fine-tuning to improve accuracy, with outputs including predictions with likelihood and confidence values.
  • Figure 2: Illustration of all possible solar panel locations as defined in the work, accompanied by representative image examples for each category. The locations include Top, Bottom, Left, Right, Center, Top-left, Top-right, Bottom-left, Bottom-right, and "NA" (indicating no solar panels).
  • Figure 3: Sample data from six regions across the United States, with tables displaying the weighted average precision, recall, and F1-score for each region.
  • Figure 4: Illustration of the dataset structure. Each rooftop image is divided into tiles and labeled with solar panel presence, location and quantity information. Example entries are shown with corresponding annotations.
  • Figure 5: Radar chart comparing the performance of various models on solar panel detection. Metrics include precision, recall, F1-score, and accuracy for "No Solar," "Solar," and weighted averages. "Prompted PVAL" and "Fine-tuned PVAL" are highlighted in pink and red, respectively, to emphasize their performance.
  • ...and 7 more figures