Analyzing Uncertainty Quantification in Statistical and Deep Learning Models for Probabilistic Electricity Price Forecasting

Andreas Lebedev, Abhinav Das, Sven Pappert, Stephan Schlüter

TL;DR

This study compares probabilistic electricity price forecasting methods that account for both data and model uncertainty, using German day-ahead market data from 2018 to 2024. It contrasts deep distributional neural networks (DDNNs) augmented with ensembles, Monte Carlo (MC) dropout, and conformal prediction (CP) against LASSO-estimated autoregressive (LEAR) models combined with quantile regression averaging (QRA), GARCH, and conformal prediction, alongside naive benchmarks. Across multiple metrics, the LEAR variants emerge as strong, well-calibrated baselines, while augmenting the DDNN with ensembles and MC dropout improves its probabilistic forecasts and stability; deep evidential regression underperforms in interval reliability. Ensemble- and CP-enhanced models can improve interval quality, but no model wins across all point and probabilistic metrics, and trading performance depends on the chosen calibration and risk considerations. The work highlights the value of combining simple, interpretable statistical models with modern uncertainty-quantification techniques for risk-aware electricity price forecasting and decision-making.

Abstract

Precise probabilistic forecasts are fundamental for energy risk management, and there is a wide range of both statistical and machine learning models for this purpose. Inherent to these probabilistic models is some form of uncertainty quantification. However, most models do not capture the full extent of uncertainty, which arises not only from the data itself but also from model and distributional choices. In this study, we examine uncertainty quantification in state-of-the-art statistical and deep learning probabilistic forecasting models for electricity price forecasting in the German market. In particular, we consider deep distributional neural networks (DDNNs) and augment them with an ensemble approach, Monte Carlo (MC) dropout, and conformal prediction to account for model uncertainty. Additionally, we consider the LASSO-estimated autoregressive (LEAR) approach combined with quantile regression averaging (QRA), generalized autoregressive conditional heteroskedasticity (GARCH), and conformal prediction. Across a range of performance metrics, we find that the LEAR-based models perform well in terms of probabilistic forecasting, irrespective of the uncertainty quantification method. Furthermore, we find that DDNNs benefit from incorporating both data and model uncertainty, improving both point and probabilistic forecasting. Uncertainty itself appears to be best captured by the models using conformal prediction. Overall, our extensive study shows that all models under consideration perform competitively. However, their relative performance depends on the choice of metrics for point and probabilistic forecasting.
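To make the conformal prediction step mentioned in the abstract concrete, the following is a minimal sketch of split conformal prediction with absolute-residual scores. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which conformal variant the paper uses, and the function name `split_conformal_interval` and its parameters are hypothetical.

```python
import math
import numpy as np

def split_conformal_interval(cal_y, cal_pred, test_pred, alpha=0.1):
    """Symmetric prediction intervals from a held-out calibration set.

    Assumption: absolute residuals are used as conformity scores; the
    paper may use a different score or an adaptive variant.
    """
    # Conformity scores: absolute errors of the point forecast on calibration data.
    scores = np.abs(np.asarray(cal_y, dtype=float) - np.asarray(cal_pred, dtype=float))
    n = scores.size
    # Finite-sample corrected rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    # If the calibration set is too small for the requested level, the interval is unbounded.
    q = np.inf if k > n else np.sort(scores)[k - 1]
    test_pred = np.asarray(test_pred, dtype=float)
    return test_pred - q, test_pred + q

# Toy usage with made-up numbers (not data from the paper).
lower, upper = split_conformal_interval(
    cal_y=[1, 2, 3, 4, 5, 6, 7, 8, 9],
    cal_pred=[0] * 9,
    test_pred=[10.0],
    alpha=0.5,
)
```

Under exchangeability of calibration and test points, intervals built this way cover the true value with probability at least 1 - alpha. Electricity prices are time series, so that assumption holds only approximately in this setting.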

Paper Structure

This paper contains 22 sections, 23 equations, 13 figures, 9 tables.

Figures (13)

  • Figure 1: Day-ahead electricity prices from 12.10.2018 to 30.11.2024.
  • Figure 2: Load and renewable energy (wind + PV) forecasts from 12.10.2018 to 30.11.2024.
  • Figure 3: Load and renewable forecasts against day-ahead electricity prices from 12.10.2018 to 30.11.2024.
  • Figure 4: Models are evaluated using PICP across confidence intervals ranging from 2% to 98%. The top subplot shows the average results over 10 independent runs, with standard deviations indicated by dotted lines. The bottom subplot displays the difference in performance between the DDNN and the other models.
  • Figure 5: Models are evaluated using MPIW across confidence intervals ranging from 2% to 98%. The top subplot shows the average results over 10 independent runs, with standard deviations indicated by dotted lines. The bottom subplot displays the difference in performance between the DDNN and the other models.
  • ...and 8 more figures
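
The captions of Figures 4 and 5 refer to PICP and MPIW. As a rough sketch, assuming these follow the usual definitions (prediction interval coverage probability as the fraction of observations falling inside their interval, and mean prediction interval width as the average distance between the upper and lower bounds), they could be computed as below. The function names and the toy numbers are illustrative and are not taken from the paper.

```python
import numpy as np

def picp(y, lower, upper):
    """Share of observations that lie inside [lower, upper] (bounds inclusive)."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    return float(np.mean((y >= lower) & (y <= upper)))

def mpiw(lower, upper):
    """Average width of the prediction intervals."""
    lower, upper = (np.asarray(a, dtype=float) for a in (lower, upper))
    return float(np.mean(upper - lower))

# Toy example with made-up prices and interval bounds.
y = [10.0, 20.0, 30.0, 40.0]
lower = [5.0, 22.0, 25.0, 35.0]
upper = [15.0, 28.0, 35.0, 45.0]

coverage = picp(y, lower, upper)   # three of the four observations are covered
width = mpiw(lower, upper)         # widths are 10, 6, 10, 10
```

A well-calibrated model has a PICP close to the nominal interval level. Among models with similar coverage, a smaller MPIW indicates sharper intervals, so the two metrics are best read together.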