Two-level Solar Irradiance Clustering with Season Identification: A Comparative Analysis

Roshni Agrawal; Sivakumar Subramanian; Venkataramana Runkana

TL;DR

This work tackles the clustering of solar irradiance patterns to support forecasting and planning by proposing a two-level framework. Level-1 identifies contiguous seasons from clear-sky irradiance (CSI) features, while Level-2 clusters daily irradiance within each season using the Daily Irradiance Index $\beta$ and compares it against two time-series distances, Euclidean Distance (ED) and Dynamic Time Warping (DTW). Across two US locations (Golden, CO and Hawaii), $\beta$-based clustering consistently yields distinct High/Medium/Low irradiance groups with better cluster validity metrics, whereas ED and especially DTW underperform. $\beta$-based clustering also remains effective for annual data. The study further analyzes day-to-day transition probabilities between irradiance levels, offering practical inputs for season-specific forecasting models and PV planning. Overall, the $\beta$-based two-level clustering provides a robust, scalable approach for irradiance analysis that can generalize to other sites and support more reliable solar-energy analytics.
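The day-to-day transition probabilities mentioned above can be illustrated with a first-order transition matrix estimated from a sequence of daily cluster labels. The following is a minimal sketch, not the authors' implementation: the function name `transition_matrix`, the label order, and the six-day label sequence are all hypothetical, and the days are assumed to be consecutive within a single season.

```python
import numpy as np

# Hypothetical label order; the summary names High/Medium/Low irradiance groups.
LEVELS = ("High", "Medium", "Low")


def transition_matrix(labels, levels=LEVELS):
    """Estimate P[i, j] = Pr(next day is levels[j] | today is levels[i])."""
    index = {name: i for i, name in enumerate(levels)}
    counts = np.zeros((len(levels), len(levels)))
    # Count each (today, tomorrow) pair of consecutive days.
    for today, tomorrow in zip(labels[:-1], labels[1:]):
        counts[index[today], index[tomorrow]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Normalize each row; rows with no observed transitions stay all-zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Illustrative label sequence (made up, not data from the paper).
daily_labels = ["High", "High", "Medium", "Low", "Low", "High"]
P = transition_matrix(daily_labels)
print(P)
```

Each row of `P` sums to one whenever that level is followed by at least one observed day, so a row can be read directly as the distribution of tomorrow's irradiance level given today's.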

Abstract

Solar irradiance clustering can enhance solar power capacity planning and help improve forecasting models by identifying similar irradiance patterns influenced by seasonal and weather changes. In this study, we adopt an efficient two-level clustering approach: the first level automatically identifies seasons using the clear sky irradiance, and the second level classifies the daily cloud level within each season as clear, partly cloudy, or cloudy. In the second level of clustering, three methods are compared, namely, the Daily Irradiance Index (DII or $\beta$), Euclidean Distance (ED), and Dynamic Time Warping (DTW) distance. The DII is computed as the ratio of the time integral of the measured irradiance to the time integral of the clear sky irradiance. The identified clusters were compared quantitatively using established clustering metrics and qualitatively by comparing the mean irradiance profiles. The results show that the $\beta$-based clustering approach outperforms both time-series methods, setting a benchmark for solar irradiance clustering studies. Moreover, $\beta$-based clustering remains effective even for annual data, unlike the time-series methods, which suffer significant performance degradation. Contrary to expectations, ED-based clustering outperforms the more compute-intensive DTW distance-based clustering. The method has been validated using data from two distinct US locations, demonstrating scalability to larger datasets and potential applicability to other locations.
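The DII definition above (time integral of measured irradiance divided by time integral of clear sky irradiance) can be sketched numerically as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function name `daily_irradiance_index`, uniform sampling, the trapezoidal rule, and the synthetic sine-shaped clear sky profile are all assumptions made for the example.

```python
import numpy as np


def daily_irradiance_index(measured, clear_sky):
    """Ratio of the time integrals of measured and clear sky irradiance.

    Assumes both profiles are sampled at the same uniformly spaced times,
    so the sampling interval cancels in the ratio.
    """
    measured = np.asarray(measured, dtype=float)
    clear_sky = np.asarray(clear_sky, dtype=float)
    if measured.shape != clear_sky.shape:
        raise ValueError("profiles must have the same length")

    def integrate(y):
        # Trapezoidal rule with unit spacing.
        return y.sum() - 0.5 * (y[0] + y[-1])

    denominator = integrate(clear_sky)
    if denominator <= 0:
        raise ValueError("clear sky integral must be positive")
    return integrate(measured) / denominator


# Synthetic example (not data from the paper): a sine-shaped clear sky day
# and a uniformly attenuated "cloudy" day at 40% of clear sky irradiance.
t = np.linspace(0.0, 1.0, 721)
clear_sky_profile = 1000.0 * np.sin(np.pi * t)
measured_profile = 0.4 * clear_sky_profile
beta = daily_irradiance_index(measured_profile, clear_sky_profile)
print(round(beta, 3))
```

Because $\beta$ reduces each day to a single scalar, clustering on it is far cheaper than computing pairwise ED or DTW distances between full daily profiles.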

Paper Structure (9 sections, 1 equation, 9 figures, 6 tables)

Introduction
Methodology
Cluster Quality Measures
Irradiance Database
Results and Discussion
Level-1: Identification of Seasons
Efficacy of DII ($\beta$)
Level-2: Clustering with $\beta$, and time-series distance methods
Summary and Conclusion

Figures (9)

Figure 1: Two-level clustering framework for irradiance analysis.
Figure 2: (a) Measured irradiance and clear sky irradiance profiles for 17 February 2017, superimposed with total cloud cover measurements (right Y-axis); (b)-(e) all-sky cloud cover images at the given sampling times, with the raw image on the left and the corresponding cloud decision image on the right, with thin cloud cover [%] and opaque cloud cover [%].
Figure 3: From the first level clustering for Hawaii, (a) identified seasonal boundaries, and (b) identified seasonal clusters.
Figure 4: Scatter plot of $\beta$ and daily mean of total cloud cover obtained from the measurements. Profiles of selected days, shown as pink, orange, and green-filled circles, are plotted in Figures 5(a), 5(b), and 5(c), respectively.
Figure 5: Comparison of measured irradiance profiles of selected days with the clear sky irradiance superimposed with cloud cover measurements (right Y-axis): (a) low cloud cover; high $\beta$, (b) medium cloud cover; medium $\beta$, and (c) high cloud cover; low $\beta$. The selected days are represented by pink, orange, and green-filled circles in Figure 4.
...and 4 more figures
