Advancing Heat Demand Forecasting with Attention Mechanisms: Opportunities and Challenges
Adithya Ramachandran, Thorkil Flensmark B. Neergaard, Andreas Maier, Siming Bayer
TL;DR
The paper tackles accurate heat demand forecasting for District Heating Systems in the transition toward renewables and decarbonization. It introduces a deep learning architecture that uses time-frequency representations via Continuous Wavelet Transform and a cross-attention mechanism to fuse endogenous and exogenous drivers, forecasting up to $n=24$ steps ahead. Key contributions include a two-branch architecture with cross-attention, decomposition of features into trend, seasonal, and residual components, and a major reduction in parameters (from ~155M to ~5.7M) while preserving accuracy, demonstrated on three Danish DMAs with hourly data from 2016–2020. The results show improved forecasting accuracy and robustness, with interpretable feature decomposition that supports practical deployment in sustainable district heating operations.
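As a rough illustration of how a cross-attention step can fuse two feature streams, the following minimal NumPy sketch computes scaled dot-product attention with queries taken from the endogenous branch and keys and values from the exogenous branch. The direction of the attention, the feature dimensions, the random projection matrices, and the function name `cross_attention` are all assumptions made for illustration; they are not taken from the paper's architecture.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(endo, exo, w_q, w_k, w_v):
    """Scaled dot-product cross-attention (hypothetical helper).

    endo: (T_endo, d_endo) endogenous features, used for the queries
    exo:  (T_exo, d_exo) exogenous features, used for keys and values
    """
    q = endo @ w_q                             # (T_endo, d)
    k = exo @ w_k                              # (T_exo, d)
    v = exo @ w_v                              # (T_exo, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (T_endo, T_exo)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v, weights


# Assumed toy sizes: 24 hourly steps, 8 endogenous and 6 exogenous features.
rng = np.random.default_rng(0)
endo = rng.normal(size=(24, 8))
exo = rng.normal(size=(24, 6))
d = 16
w_q = rng.normal(size=(8, d))
w_k = rng.normal(size=(6, d))
w_v = rng.normal(size=(6, d))

fused, weights = cross_attention(endo, exo, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In a trained model the projection matrices would be learned rather than sampled, and the inputs would be the time-frequency representations rather than random arrays.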
Abstract
Global leaders and policymakers are unified in their unequivocal commitment to decarbonization efforts in support of Net-Zero agreements. District Heating Systems (DHS), while contributing to carbon emissions due to their continued reliance on fossil fuels for heat production, are embracing more sustainable practices, albeit with some sense of vulnerability, as this shift could constrain their ability to adapt to dynamic demand and production scenarios. As demographic demands grow and renewables become the central strategy in decarbonizing the heating sector, the need for accurate demand forecasting has intensified. Advances in digitization have paved the way for Machine Learning (ML) based solutions to become the industry standard for modeling complex time series patterns. In this paper, we focus on building a Deep Learning (DL) model that uses deconstructed components of the independent and dependent variables that affect heat demand as features to perform multi-step ahead forecasting of heat demand. The model represents the input features in a time-frequency space and uses an attention mechanism to generate accurate forecasts. The proposed method is evaluated on a real-world dataset, and its forecasting performance is assessed against LSTM and CNN-based forecasting models. Across different supply zones, the attention-based model outperforms the baselines quantitatively and qualitatively, with a Mean Absolute Error (MAE) of 0.105 with a standard deviation of 0.06 kWh and a Mean Absolute Percentage Error (MAPE) of 5.4% with a standard deviation of 2.8%, in comparison to the second-best model, which has an MAE of 0.10 with a standard deviation of 0.06 kWh and a MAPE of 5.6% with a standard deviation of 3%.
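For readers unfamiliar with the two reported metrics, the sketch below shows the standard definitions of MAE and MAPE in NumPy. The function names and the toy arrays are illustrative assumptions and are unrelated to the paper's data; how the paper aggregates errors across forecast horizons and supply zones is not specified here.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean Absolute Error, in the same unit as the target (e.g. kWh)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent.

    Undefined when any true value is zero, so this sketch assumes
    strictly non-zero demand values.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))


# Toy values chosen only to make the arithmetic easy to follow.
y_true = np.array([2.0, 4.0, 5.0, 10.0])
y_pred = np.array([1.0, 5.0, 5.0, 8.0])

print(mae(y_true, y_pred))   # (1 + 1 + 0 + 2) / 4 = 1.0
print(mape(y_true, y_pred))  # 100 * (0.5 + 0.25 + 0 + 0.2) / 4 = 23.75
```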
