Short-Term Solar Irradiance Forecasting Under Data Transmission Constraints
Joshua Edward Hammond, Ricardo A. Lara Orozco, Michael Baldea, Brian A. Korgel
TL;DR
This work tackles near-term solar irradiance forecasting under data transmission constraints by proposing a data-parsimonious CNN-LSTM that uses scalar sky-camera features and an optional noise input to capture unmeasured disturbances. The novelty lies in predicting the deviation from a long-term baseline (the persistence of cloudiness, POC) rather than irradiance itself, which de-trends the problem and improves forecast accuracy. Three rolling train-validate sets are used to select the time representations, the input sequence length, and the most important features, with a permutation-based analysis guiding feature selection. Empirically, the final model achieves an MAE of $74.34$ W/m$^2$ on the chosen test data, compared with $134.35$ W/m$^2$ for the POC baseline, while requiring orders of magnitude less data than end-to-end image-based approaches, which makes it practical for bandwidth-limited, remote solar sites.
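The permutation-based feature analysis mentioned above can be illustrated with a minimal, model-agnostic sketch. It is not the authors' implementation: the function name `permutation_importance`, the flat feature matrix `X`, and the toy linear predictor are assumptions made for illustration, whereas the paper's model is a CNN-LSTM operating on input sequences. The idea is to shuffle one feature at a time and record how much the MAE increases.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in MAE when each feature column is shuffled.

    predict: any callable mapping an (n, d) array to n predictions (assumed).
    """
    rng = np.random.default_rng(seed)
    base_mae = np.mean(np.abs(y - predict(X)))
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link to the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            perm_mae = np.mean(np.abs(y - predict(X_perm)))
            increases.append(perm_mae - base_mae)
        scores[j] = np.mean(increases)
    return scores

# Toy data (hypothetical): feature 0 matters most, feature 1 not at all.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.1 * X[:, 2]

def toy_predict(features):
    # Stand-in for a trained model; here it reproduces y exactly.
    return 3.0 * features[:, 0] + 0.1 * features[:, 2]

scores = permutation_importance(toy_predict, X, y)
```

Features whose shuffling barely changes the error are candidates for removal, which is relevant when every transmitted scalar counts against a bandwidth budget.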
Abstract
We report a data-parsimonious machine learning model for short-term forecasting of solar irradiance. The model inputs include sky camera images that are reduced to scalar features to meet data transmission constraints. The output irradiance values are transformed to focus on unknown short-term dynamics. Inspired by control theory, a noise input is used to reflect unmeasured variables and is shown to improve model predictions, often considerably. Five years of data from the NREL Solar Radiation Research Laboratory were used to create three rolling train-validate sets and determine the best representations for time, the optimal span of input measurements, and the most impactful model input data (features). For the chosen test data, the model achieves a mean absolute error of $74.34$ W/m$^2$, compared with $134.35$ W/m$^2$ for the persistence of cloudiness baseline model.
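As a rough illustration of the output transformation and the baseline comparison, the sketch below assumes that the persistence of cloudiness forecast holds the current clear-sky index (measured irradiance divided by clear-sky irradiance) constant over the forecast horizon; the abstract does not spell out this definition. All function names and the toy numbers are hypothetical. The model would then be trained on the residual between the future irradiance and this baseline, and the residual prediction is added back to recover irradiance.

```python
import numpy as np

def poc_forecast(ghi_now, clear_now, clear_future, eps=1e-6):
    """Baseline forecast: keep the current clear-sky index fixed (assumed definition)."""
    clear_sky_index = ghi_now / np.maximum(clear_now, eps)
    return clear_sky_index * clear_future

def to_residual(ghi_future, baseline):
    """Training target: deviation of the true irradiance from the baseline."""
    return ghi_future - baseline

def from_residual(residual_pred, baseline):
    """Map a predicted deviation back to an irradiance forecast."""
    return residual_pred + baseline

def mae(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Toy values in W/m^2 (hypothetical, not taken from the NREL data).
ghi_now = np.array([400.0, 300.0])
clear_now = np.array([800.0, 600.0])
clear_future = np.array([820.0, 580.0])
ghi_future = np.array([450.0, 250.0])

baseline = poc_forecast(ghi_now, clear_now, clear_future)
target = to_residual(ghi_future, baseline)

# Pretend a trained model returned these residual predictions.
residual_pred = np.array([30.0, -30.0])
forecast = from_residual(residual_pred, baseline)

baseline_mae = mae(ghi_future, baseline)
model_mae = mae(ghi_future, forecast)
```

In this toy case the baseline error is 40 W/m$^2$ and the corrected forecast error is 10 W/m$^2$; the figures only demonstrate the bookkeeping and say nothing about the reported results.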
