Solar Photovoltaic Assessment with Large Language Model
Muhao Guo, Yang Weng
TL;DR
The paper addresses transparent, scalable detection, localization, and counting of rooftop solar PV in satellite imagery. It introduces PVAL, a framework that combines data engineering, task-decomposed prompting, schema-guided JSON outputs, few-shot prompting, and targeted fine-tuning of a multimodal LLM to deliver structured predictions with likelihood and confidence metrics. Empirical results show that prompted PVAL is competitive, while fine-tuned PVAL achieves superior accuracy (e.g., solar F1-scores around 82–83% with high recall) across six U.S. regions and maintains performance in cross-continental tests. A confidence-driven auto-labeling mechanism reduces manual labeling needs, which supports scalable PV inventories and PV-aware grid operations such as state estimation and load forecasting in ADNs and microgrids. Overall, PVAL demonstrates that constraining multimodal LLMs with a spatial schema and uncertainty signals can be a cost-effective, generalizable tool for large-scale renewable energy integration and grid resilience.
Abstract
Accurate detection and localization of solar photovoltaic (PV) panels in satellite imagery is essential for optimizing microgrids and active distribution networks (ADNs), which are critical components of renewable energy systems. Existing methods lack transparency regarding their underlying algorithms or training datasets, rely on large, high-quality PV training data, and struggle to generalize to new geographic regions or varied environmental conditions without extensive re-training. These limitations lead to inconsistent detection outcomes, hindering large-scale deployment and data-driven grid optimization. In this paper, we investigate how large language models (LLMs) can be leveraged to overcome these limitations. Despite their promise, LLMs have weaknesses of their own in solar panel detection, including difficulties with multi-step logical processes, inconsistent output formatting, frequent misclassification of visually similar objects (e.g., shadows, parking lots), and low accuracy in complex tasks such as spatial localization and quantification. To address these issues, we propose the PV Assessment with LLMs (PVAL) framework, which incorporates task decomposition for more efficient workflows, output standardization for consistent and scalable formatting, few-shot prompting to enhance classification accuracy, and fine-tuning using curated PV datasets with detailed annotations. PVAL ensures transparency, scalability, and adaptability across heterogeneous datasets while minimizing computational overhead. By combining open-source accessibility with robust methodologies, PVAL establishes an automated and reproducible pipeline for solar panel detection, paving the way for large-scale renewable energy integration and optimized grid management.
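To make output standardization and confidence-driven auto-labeling concrete, the following Python sketch shows one way a structured model reply could be validated and then routed either to automatic labeling or to manual review. It is an illustration under stated assumptions, not the PVAL implementation: the field names (has_solar, likelihood, confidence, location, panel_count), the location labels, the function names, and the 0.9 threshold are all hypothetical and are not taken from the paper.

```python
import json
from dataclasses import dataclass

# All field names, location labels, and the default threshold are
# illustrative assumptions; the source does not specify PVAL's schema.
ALLOWED_LOCATIONS = {
    "top-left", "top-right", "bottom-left", "bottom-right", "center", "none",
}


@dataclass
class PVPrediction:
    has_solar: bool
    likelihood: float   # reported probability that PV is present, in [0, 1]
    confidence: float   # reported certainty in the answer, in [0, 1]
    location: str
    panel_count: int


def _in_unit_interval(value) -> bool:
    # bool is a subclass of int, so it is excluded explicitly.
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and 0.0 <= value <= 1.0
    )


def parse_prediction(raw: str) -> PVPrediction:
    """Parse one JSON reply; raise ValueError on any schema violation."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    try:
        pred = PVPrediction(**data)
    except TypeError as exc:  # missing or unexpected keys
        raise ValueError(f"schema mismatch: {exc}") from exc
    if not isinstance(pred.has_solar, bool):
        raise ValueError("has_solar must be a boolean")
    if not _in_unit_interval(pred.likelihood):
        raise ValueError("likelihood must be a number in [0, 1]")
    if not _in_unit_interval(pred.confidence):
        raise ValueError("confidence must be a number in [0, 1]")
    if not isinstance(pred.location, str) or pred.location not in ALLOWED_LOCATIONS:
        raise ValueError("location is not an allowed label")
    if (
        not isinstance(pred.panel_count, int)
        or isinstance(pred.panel_count, bool)
        or pred.panel_count < 0
    ):
        raise ValueError("panel_count must be a non-negative integer")
    return pred


def route(pred: PVPrediction, threshold: float = 0.9) -> str:
    """Accept confident predictions as labels; send the rest to a human."""
    return "auto_label" if pred.confidence >= threshold else "manual_review"
```

Rejecting malformed replies before they reach downstream code is what keeps the pipeline consistent across many images, and routing on the confidence field is one simple way to limit manual labeling to uncertain cases.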
