Two-level Solar Irradiance Clustering with Season Identification: A Comparative Analysis
Roshni Agrawal, Sivakumar Subramanian, Venkataramana Runkana
TL;DR
This work clusters solar irradiance patterns to support forecasting and planning, using a two-level framework. Level 1 identifies contiguous seasons from clear-sky irradiance (CSI) features, while Level 2 clusters daily irradiance within each season using the Daily Irradiance Index $\beta$ and compares it against two time-series distances, Euclidean distance (ED) and dynamic time warping (DTW). Across two US locations (Golden, CO and Hawaii), $\beta$-based clustering consistently yields distinct High/Medium/Low irradiance groups with better cluster validity metrics, whereas ED and especially DTW underperform; $\beta$ also remains effective on annual data. The study further analyzes day-to-day transition probabilities between irradiance levels, which can serve as inputs for season-specific forecasting models and PV planning. Overall, the $\beta$-based two-level clustering is a robust, scalable approach to irradiance analysis that may generalize to other sites and support more reliable solar-energy analytics.
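As a rough illustration of the day-to-day transition analysis mentioned above, the sketch below estimates transition probabilities from a sequence of daily cluster labels. It assumes a simple first-order, count-based estimate; the function name `transition_probabilities` and the example label sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence


def transition_probabilities(labels: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Estimate P(next-day level | current-day level) from consecutive labels.

    Assumption: a first-order, count-based estimate over adjacent days.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    # Count each (today, tomorrow) pair of consecutive daily labels.
    for today, tomorrow in zip(labels, labels[1:]):
        counts[today][tomorrow] += 1

    probs: Dict[str, Dict[str, float]] = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        # Normalize each row so its probabilities sum to 1.
        probs[state] = {nxt: c / total for nxt, c in next_counts.items()}
    return probs


# Hypothetical label sequence for seven consecutive days in one season.
days = ["High", "High", "Medium", "High", "Low", "Low", "High"]
probs = transition_probabilities(days)
for state, row in probs.items():
    print(state, row)
```

Levels that never appear as a "current day" simply have no row; with real data one would likely compute a separate table per identified season.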
Abstract
Solar irradiance clustering can enhance solar power capacity planning and help improve forecasting models by identifying similar irradiance patterns influenced by seasonal and weather changes. In this study, we adopt an efficient two-level clustering approach that automatically identifies seasons from the clear sky irradiance in the first level and then classifies the daily cloud level as clear, cloudy, or partly cloudy within each season in the second level. In the second level of clustering, three methods are compared, namely, the Daily Irradiance Index (DII or $\beta$), Euclidean Distance (ED), and Dynamic Time Warping (DTW) distance. The DII is computed as the ratio of the time integral of the measured irradiance to the time integral of the clear sky irradiance. The identified clusters were compared quantitatively using established clustering metrics and qualitatively by comparing the mean irradiance profiles. The results establish the superiority of the $\beta$-based clustering approach over both time-series methods, setting a benchmark for solar irradiance clustering studies. Moreover, $\beta$-based clustering remains effective even for annual data, unlike the time-series methods, which suffer significant performance degradation. Contrary to expectations, ED-based clustering outperforms the more compute-intensive DTW distance-based clustering. The method has been validated using data from two distinct US locations, demonstrating scalability to larger datasets and potential applicability to other locations.
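The DII definition above can be written as $\beta = \int G(t)\,dt \,/ \int G_{cs}(t)\,dt$, where $G$ and $G_{cs}$ are notation assumed here for the measured and clear sky irradiance over one day. The following is a minimal sketch of that ratio, assuming evenly spaced samples and trapezoidal integration; the function names and sample values are illustrative and not from the paper.

```python
from typing import Sequence


def trapezoid(values: Sequence[float], dt_hours: float) -> float:
    """Trapezoidal time integral of evenly spaced samples."""
    if len(values) < 2:
        return 0.0
    # Interior points get full weight, the two end points half weight.
    return dt_hours * (sum(values) - 0.5 * (values[0] + values[-1]))


def daily_irradiance_index(
    measured: Sequence[float],
    clear_sky: Sequence[float],
    dt_hours: float = 1.0,
) -> float:
    """Ratio of integrated measured irradiance to integrated clear sky irradiance."""
    if len(measured) != len(clear_sky):
        raise ValueError("measured and clear_sky must have the same length")
    denominator = trapezoid(clear_sky, dt_hours)
    if denominator <= 0.0:
        raise ValueError("clear sky irradiance must integrate to a positive value")
    return trapezoid(measured, dt_hours) / denominator


# Made-up hourly profiles in W/m^2 (not data from the study).
clear_sky = [0.0, 200.0, 600.0, 900.0, 600.0, 200.0, 0.0]
cloudy_day = [0.5 * g for g in clear_sky]  # half of clear sky all day

beta_clear = daily_irradiance_index(clear_sky, clear_sky)
beta_cloudy = daily_irradiance_index(cloudy_day, clear_sky)
print(beta_clear, beta_cloudy)
```

Because each day collapses to a single scalar, the second-level clustering operates on one value per day rather than on a full time series, which is one plausible reason it scales well to larger datasets.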
