GloSoFarID: Global multispectral dataset for Solar Farm IDentification in satellite imagery

Zhiyuan Yang, Ryan Rad

TL;DR

GloSoFarID tackles the challenge of globally mapping solar-farm expansion by introducing a global, multispectral satellite dataset with $13$ bands and $10$ m resolution, spanning $2021$-$2023$. The authors implement a three-stage construction pipeline (initial data assembly, state-of-the-art model training, and ensemble-based new-data generation with rigorous quality control) to produce a high-quality benchmark dataset of $13{,}703$ samples ($256 \times 256$ pixels each) across diverse regions. Benchmarking FCN, Half-UNet, and U-Net establishes baseline segmentation performance (IoU up to $79.3\%$ and F-score up to $87.8\%$) and demonstrates the dataset's suitability for global solar-farm identification. Overall, GloSoFarID provides a timely, rich resource to drive machine learning-based monitoring of solar energy infrastructure and support sustainable energy planning.

Abstract

Solar Photovoltaic (PV) technology is increasingly recognized as a pivotal solution in the global pursuit of clean and renewable energy. This technology addresses the urgent need for sustainable energy alternatives by converting solar power into electricity without greenhouse gas emissions. It not only curtails global carbon emissions but also reduces reliance on finite, non-renewable energy sources. In this context, monitoring solar panel farms becomes essential for understanding and facilitating the worldwide shift toward clean energy. This study contributes to this effort by developing the first comprehensive global dataset of multispectral satellite imagery of solar panel farms. This dataset is intended to form the basis for training robust machine learning models, which can accurately map and analyze the expansion and distribution of solar panel farms globally. The insights gained from this endeavor will be instrumental in guiding informed decision-making for a sustainable energy future.

Project repository: https://github.com/yzyly1992/GloSoFarID

Paper Structure (11 sections, 5 figures, 3 tables)

Figures (5)

  • Figure 1: Examples from our GloSoFarID dataset. Each row represents one sample with all $13$ bands along with the ground truth mask, where the gray color indicates the solar farm area. See Table \ref{tab:bands-info} for more details.
  • Figure 2: Dataset Construction Pipeline
  • Figure 3: Global Distribution of Data Points. Sparse regions are shown in light orange and dense regions in dark orange.
  • Figure 4: Architecture of FCN, Half-UNet, and U-Net
  • Figure 5: Qualitative Results of FCN, Half-UNet, and U-Net Models. The first row displays the test samples in true color. Rows $2$-$4$ depict the prediction masks for solar farm areas. In this representation, True Positive is shown as white, True Negative as black, False Positive as red, and False Negative as blue.
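
The color coding described for Figure 5 and the reported metrics (IoU and F-score) both derive from the same per-pixel comparison of a predicted mask against the ground truth. The following is a minimal sketch of that comparison, not the authors' code: the function names, the exact RGB values, the use of the balanced F1 as the F-score, and the convention for two empty masks are all assumptions made for illustration. It uses plain Python lists for portability; real $256 \times 256$ masks would normally be processed with an array library.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]       # binary mask: 1 = solar farm, 0 = background
RGB = Tuple[int, int, int]

# Colors follow the Figure 5 caption; the exact RGB shades are assumed.
COLORS: Dict[str, RGB] = {
    "TP": (255, 255, 255),   # white
    "TN": (0, 0, 0),         # black
    "FP": (255, 0, 0),       # red
    "FN": (0, 0, 255),       # blue
}


def outcome(pred: int, truth: int) -> str:
    """Classify one pixel as TP, TN, FP, or FN."""
    if pred and truth:
        return "TP"
    if pred:
        return "FP"
    if truth:
        return "FN"
    return "TN"


def error_map(pred: Mask, truth: Mask) -> List[List[RGB]]:
    """Build an RGB image that colors every pixel by its outcome."""
    return [
        [COLORS[outcome(p, t)] for p, t in zip(pred_row, truth_row)]
        for pred_row, truth_row in zip(pred, truth)
    ]


def iou_and_f1(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Return (IoU, F1) for the positive (solar farm) class."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            counts[outcome(p, t)] += 1
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    union = tp + fp + fn
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    return tp / union, 2 * tp / (2 * tp + fp + fn)


# Tiny hypothetical example: one pixel of each outcome.
pred = [[1, 1], [0, 0]]
truth = [[1, 0], [1, 0]]
iou, f1 = iou_and_f1(pred, truth)
print(f"IoU={iou:.3f}, F1={f1:.3f}")  # IoU=0.333, F1=0.500
```

In this toy case the masks share one positive pixel out of three that are positive in either mask, so IoU is $1/3$, while F1 is $2 \cdot 1 / (2 \cdot 1 + 1 + 1) = 0.5$. F1 is never lower than IoU for the same prediction, which is consistent with the reported F-score exceeding the reported IoU.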