Comparative Analysis of Modern Machine Learning Models for Retail Sales Forecasting

Luka Hobor, Mario Brcic, Lidija Polutnik, Ante Kapetanovic

TL;DR

SAITS-based imputation improved neural network performance in aggregated settings, but the neural networks still underperformed tree-based ensemble methods. Model selection should prioritize alignment with problem characteristics over architectural sophistication.

Abstract

Accurate demand forecasting is critical for brick-and-mortar retailers to optimize inventory management and minimize costs. This study evaluates statistical baselines, tree-based ensembles (XGBoost and LightGBM), and deep learning architectures (N-BEATS, N-HiTS, and the Temporal Fusion Transformer) on retail sales data characterized by intermittent demand, substantial missingness, and frequent product turnover. Models are compared across four configurations varying by aggregation level and imputation strategy, using evaluation protocols that reflect typical deployment patterns for each model class. Localized tree-based methods achieve superior performance, with XGBoost attaining the lowest RMSE of 4.833. While SAITS-based imputation improved neural network performance in aggregated settings, these models remained inferior to ensemble methods. The results suggest that, under the studied constraints, model selection should prioritize alignment with problem characteristics over architectural sophistication.
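
For readers unfamiliar with the headline metric, the following is a minimal sketch of how root mean squared error (RMSE) is computed over a set of forecasts. The function name `rmse` and the sample values are illustrative assumptions, not data or code from the study, and the abstract does not state whether the reported RMSE is pooled over all series or averaged per series.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each forecast error, average the squares, then take the root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical intermittent-demand series (many zero-sales days).
actual = [0, 0, 3, 0, 5]
predicted = [1, 0, 2, 0, 3]
score = rmse(actual, predicted)
print(f"RMSE: {score:.4f}")
```

Because errors are squared before averaging, a few large misses on high-sales days dominate the score, which is worth keeping in mind when comparing models on intermittent demand.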

Paper Structure

This paper contains 11 sections, 2 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Sales Values Real and Imputed
  • Figure 2: Example of Aggregated Predicted vs Actual Sales on a Single Group
  • Figure 3: Comparison of Training and Validation MSE for N-BEATS and TFT Models