FusionSF: Fuse Heterogeneous Modalities in a Vector Quantized Framework for Robust Solar Power Forecasting
Ziqing Ma, Wenwei Wang, Tian Zhou, Chao Chen, Bingqing Peng, Liang Sun, Rong Jin
TL;DR
The paper addresses accurate day-ahead solar power forecasting for data-scarce plants by introducing FusionSF, a vector-quantized, multi-modal Transformer framework that fuses historical power data, satellite imagery, and future numerical weather prediction (NWP). It uses Rotary Positional Encoding, patching, and residual Vector Quantization, with a Cross Transformer to integrate the modalities and a decoder to generate next-step predictions. The authors release the MMSP dataset, demonstrate strong zero-shot performance, and deploy FusionSF via the eForecaster platform across more than 300 plants totaling over 15 GW, reporting notable improvements over state-of-the-art baselines. The work highlights the value of aligning heterogeneous data sources in solar forecasting, enabling more robust grid integration and potential cost savings through improved forecast accuracy.
Abstract
Accurate solar power forecasting is crucial for integrating photovoltaic plants into the electric grid, scheduling, and keeping the power grid safe. The problem is more demanding for newly installed solar plants, which lack sufficient data. Current research predominantly relies on historical solar power data or numerical weather prediction in a single-modality format, ignoring the complementary information provided by different modalities. In this paper, we propose a multi-modality fusion framework that integrates historical power data, numerical weather prediction, and satellite images, significantly improving forecast performance. We introduce a vector quantized framework that aligns modalities with varying information densities, striking a balance between integrating sufficient information and averting model overfitting. Our framework demonstrates strong zero-shot forecasting capability, which is especially useful for newly installed plants. Moreover, we collect and release a multi-modal solar power (MMSP) dataset from real-world plants to further promote research on multi-modal solar forecasting algorithms. Our extensive experiments show that our model is robust and improves accuracy both in zero-shot forecasting and in scenarios rich with training data, surpassing leading models. We have incorporated it into our eForecaster platform and deployed it for more than 300 solar plants with a capacity of over 15 GW.
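
To make the idea of vector quantization concrete, the following is a minimal NumPy sketch of generic residual vector quantization, in which each stage quantizes what the previous stage left unexplained. It is an illustration under stated assumptions, not the FusionSF implementation: the function names `quantize` and `residual_vq`, the random codebooks, and all sizes (8 tokens, dimension 4, 3 stages of 16 codes) are hypothetical, and in a trained model the codebooks would be learned rather than sampled.

```python
import numpy as np


def quantize(x, codebook):
    """Map each row of x (n, d) to its nearest codebook entry (k, d)."""
    # Squared Euclidean distance between every token and every code: (n, k).
    dists = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dists.argmin(axis=1)
    return idx, codebook[idx]


def residual_vq(x, codebooks):
    """Quantize x stage by stage; each stage encodes the remaining residual."""
    residual = x.copy()
    recon = np.zeros_like(x)
    codes = []
    for codebook in codebooks:
        idx, q = quantize(residual, codebook)
        codes.append(idx)
        recon += q      # accumulate the quantized approximation
        residual -= q   # pass what is left to the next stage
    return np.stack(codes, axis=1), recon


# Hypothetical inputs: 8 patch embeddings of dimension 4.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 4))
# Hypothetical codebooks (random here, learned in practice), 16 codes each.
codebooks = [rng.normal(size=(16, 4)) * scale for scale in (1.0, 0.5, 0.25)]

codes, recon = residual_vq(tokens, codebooks)
print(codes.shape, recon.shape)
```

Each token ends up represented by one discrete index per stage, so modalities with very different information densities can be mapped into a shared, compact code space before fusion.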
