Solar Multimodal Transformer: Intraday Solar Irradiance Predictor using Public Cameras and Time Series
Yanan Niu, Roy Sarkis, Demetri Psaltis, Mario Paolone, Christophe Moser, Luisa Lambertini
TL;DR
The paper addresses intraday solar irradiance forecasting at horizons from 10 minutes to hours ahead by introducing the Solar Multimodal Transformer (SMT), which fuses single-frame public camera imagery with historical global horizontal irradiance (GHI) time series through an early-fusion transformer. A normalization step scales GHI by the daily maximum clear-sky value to emphasize sky clearness, which improves forecast accuracy. SMT, which combines lightweight image and time-series integration with optional CNN/U-Net hybrids, achieves a 25.95% RMSE reduction compared with Solcast over a 12-day test. Ablation and attention analyses provide insight into when and how each modality contributes. The work demonstrates practical potential for scalable, camera-agnostic solar forecasting, with applications in energy markets and grid planning.
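The normalization step described above can be illustrated with a minimal sketch. It assumes the clear-sky GHI values for the day are already available from some clear-sky model; the function name `scale_by_daily_clear_sky_max` and the sample values are hypothetical, and the paper's exact procedure may differ in detail.

```python
def scale_by_daily_clear_sky_max(ghi, clear_sky):
    """Scale one day's GHI samples by that day's maximum clear-sky GHI.

    ghi       -- measured GHI samples for a single day (W/m^2)
    clear_sky -- modeled clear-sky GHI for the same timestamps (W/m^2);
                 assumed to be supplied by the caller
    """
    if len(ghi) != len(clear_sky):
        raise ValueError("ghi and clear_sky must have the same length")
    daily_max = max(clear_sky)
    if daily_max <= 0:
        raise ValueError("clear-sky maximum must be positive")
    # One scale factor per day, so the daily shape of the series is preserved.
    return [g / daily_max for g in ghi]


# Illustrative values only, not data from the paper.
measured = [0.0, 200.0, 400.0, 100.0]
modeled_clear_sky = [0.0, 400.0, 800.0, 300.0]
scaled = scale_by_daily_clear_sky_max(measured, modeled_clear_sky)
```

Dividing by a single daily maximum keeps the values roughly within [0, 1] across seasons, so the model sees how clear the sky is rather than the absolute irradiance level.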
Abstract
Accurate intraday solar irradiance forecasting is crucial for optimizing dispatch planning and electricity trading. For this purpose, we introduce a novel and effective approach with three components that distinguish it from the literature: 1) the uncommon use of single-frame public camera imagery; 2) solar irradiance time series scaled with a proposed normalization step, which boosts performance; and 3) a lightweight multimodal model, called the Solar Multimodal Transformer (SMT), that delivers accurate short-term solar irradiance forecasts by combining images and scaled time series. Benchmarked against Solcast, a leading solar forecasting service provider, our model improved prediction accuracy by 25.95%. Our approach adapts easily to various camera specifications, offering broad applicability to real-world solar forecasting challenges.
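As a rough illustration of how an improvement figure of this kind can be computed, the sketch below assumes the metric is the relative reduction in RMSE with respect to a baseline forecast, as the TL;DR states. The function names and all numbers are hypothetical and are not results from the paper.

```python
import math


def rmse(pred, truth):
    """Root-mean-square error between two equal-length sequences."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def rmse_reduction_percent(model_rmse, baseline_rmse):
    """Relative RMSE reduction of a model versus a baseline, in percent."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Illustrative values only: a model RMSE of 75 against a baseline RMSE of 100.
example_reduction = rmse_reduction_percent(75.0, 100.0)
```

A positive value means the model's error is lower than the baseline's; a negative value means it is higher.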
