Robust Probabilistic Load Forecasting for a Single Household: A Comparative Study from SARIMA to Transformers on the REFIT Dataset

Midhun Manoj

TL;DR

This study addresses risk-aware, probabilistic short-term load forecasting for a single household using the REFIT dataset, focusing on preserving distributional structure in the presence of large data gaps. It systematically compares a hierarchy of models—from Seasonal Naïve and SARIMAX to LightGBM/XGBoost and probabilistic deep learning (LSTM and TFT)—after carefully imputing gaps with a Seasonal Imputer and engineering time-aware features. The results show classical models falter on nonlinear, regime-switching patterns, while TFT delivers the best overall performance with safe, adaptive prediction intervals, and LSTM provides strong probabilistic calibration. The findings highlight the practical value of combining robust gap handling with advanced probabilistic forecasting for reliable energy management, and point to scalable extensions across multiple homes and NILM applications.

Abstract

Probabilistic forecasting is essential for modern risk management, allowing decision-makers to quantify uncertainty in critical systems. This paper tackles this challenge using the volatile REFIT household dataset, which is complicated by a large structural data gap. We first address the gap through a comparative imputation experiment, which selects a Seasonal Imputation method and demonstrates that it preserves the data's underlying distribution better than linear interpolation. We then systematically evaluate a hierarchy of models, progressing from classical baselines (SARIMA, Prophet) to machine learning (XGBoost) and advanced deep learning architectures (LSTM, TFT). Our findings reveal that classical models fail to capture the data's non-linear, regime-switching behavior. While the LSTM provided the best-calibrated probabilistic forecast, the Temporal Fusion Transformer (TFT) emerged as the superior all-round model, achieving the best point forecast accuracy (RMSE 481.94) and producing safer, more cautious prediction intervals that effectively capture extreme volatility.
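To make the imputation comparison concrete, here is a minimal sketch of the two strategies on a toy series. It is an illustration under stated assumptions, not the paper's implementation: the function names `seasonal_impute` and `linear_impute`, the phase-mean fill rule, the toy values, and the period of 2 are all hypothetical, and the actual Seasonal Imputer may work differently.

```python
from statistics import mean


def seasonal_impute(values, period):
    """Fill None entries with the mean of observed values at the same seasonal phase.

    Assumption: "seasonal" here means position modulo `period`; this is one
    simple reading of seasonal imputation, not necessarily the paper's.
    """
    filled = list(values)
    for phase in range(period):
        observed = [v for v in values[phase::period] if v is not None]
        if not observed:
            continue  # nothing observed at this phase; leave its gaps untouched
        phase_mean = mean(observed)
        for i in range(phase, len(values), period):
            if filled[i] is None:
                filled[i] = phase_mean
    return filled


def linear_impute(values):
    """Fill interior None runs with a straight line between the nearest known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Toy alternating load in watts (low/high) with a four-sample gap.
load = [100, 900, 100, 900, None, None, None, None, 100, 900, 100, 900]

seasonal = seasonal_impute(load, period=2)
linear = linear_impute(load)

print(seasonal[4:8])  # values stay at the two observed load levels
print(linear[4:8])    # values ramp through levels never seen in the data
```

On this toy series the seasonal fill reuses the two observed load levels, so the two-peaked shape of the distribution survives, whereas the linear fill introduces intermediate values that the household never actually drew. That is the kind of distributional distortion the comparison in the abstract is concerned with.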

Paper Structure

This paper contains 27 sections, 2 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: Yearly average load, highlighting the structural data gap from late 2014 to early 2015.
  • Figure 2: Distribution of Imputed Values vs. Ground Truth. The Seasonal Imputer (purple) successfully preserves the primary peak of the Ground Truth (black), while the Linear Imputer (orange) fails to replicate the bimodal structure.
  • Figure 3: LSTM cell with its gates
  • Figure 4: TFT architecture
  • Figure 5: Seasonal Naïve forecast. It captures the weekend drop but is misaligned with the weekday peaks.
  • ...and 5 more figures