Large Scale Hierarchical Industrial Demand Time-Series Forecasting incorporating Sparsity

Harshavardhan Kamarthi, Aditya B. Sasanur, Xinjie Tong, Xingyu Zhou, James Peters, Joe Czyzyk, B. Aditya Prakash

TL;DR

This paper tackles large-scale hierarchical demand time-series forecasting with varying sparsity across levels. It introduces HAILS, which adaptively models sparse and dense time-series with different distributions (Poisson for sparse, Gaussian for dense) and uses a Distributional Consistency Regularization with Sparse adaptation (DCRS) to reconcile forecasts across the hierarchy. Empirical results on M5 and Dow show that HAILS improves forecast accuracy and calibration, including substantial gains in sparse layers, while providing notable training efficiency improvements. The work demonstrates practical deployment in an industrial setting and discusses deployment considerations such as data quality, explainability, and dynamic hierarchies.

Abstract

Hierarchical time-series forecasting (HTSF) is an important problem for many real-world business applications where the goal is to simultaneously forecast multiple time-series that are related to each other via a hierarchical relation. Recent works, however, do not address two important challenges that are typically observed in many demand forecasting applications at large companies. First, many time-series at lower levels of the hierarchy have high sparsity, i.e., they contain a significant number of zeros. Most HTSF methods do not address this varying sparsity across the hierarchy. Second, they do not scale well to the large size of real-world hierarchies, which is typically not seen in the benchmarks used in the literature. We resolve both challenges by proposing HAILS, a novel probabilistic hierarchical model that enables accurate and calibrated probabilistic forecasts across the hierarchy by adaptively modeling sparse and dense time-series with different distributional assumptions and reconciling them to adhere to hierarchical constraints. We show the scalability and effectiveness of our methods by evaluating them against real-world demand forecasting datasets. We deploy HAILS at a large chemical manufacturing company for a product demand forecasting application with over ten thousand products and observe a significant 8.5% improvement in forecast accuracy and a 23% improvement for sparse time-series. The enhanced accuracy and scalability make HAILS a valuable tool for improved business planning and customer experience.
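The idea of choosing a distribution family per time-series based on its sparsity can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names `zero_fraction` and `choose_family`, the use of the fraction of zeros as the sparsity measure, and the 0.5 threshold are all assumptions made for the example.

```python
from typing import Sequence


def zero_fraction(series: Sequence[float]) -> float:
    """Fraction of observations that are exactly zero."""
    if not series:
        raise ValueError("series must be non-empty")
    return sum(1 for x in series if x == 0) / len(series)


def choose_family(series: Sequence[float], threshold: float = 0.5) -> str:
    """Pick a forecast distribution family from the sparsity of the series.

    The threshold is a hypothetical value chosen for illustration only.
    """
    return "poisson" if zero_fraction(series) >= threshold else "gaussian"


# Toy data: a sparse product-level series and a dense aggregate series.
sparse = [0, 0, 3, 0, 0, 1, 0, 0]
dense = [120, 135, 128, 140, 131, 125, 138, 129]

print(choose_family(sparse))  # poisson
print(choose_family(dense))   # gaussian
```

In practice the cutoff between sparse and dense would be tuned per dataset; the sketch only shows where such a decision sits in the pipeline.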

Paper Structure (21 sections, 1 theorem, 9 equations, 7 figures, 5 tables)

This paper contains 21 sections, 1 theorem, 9 equations, 7 figures, 5 tables.

Key Result

Theorem 1

Let $X_1, X_2, \dots, X_N$ be $N$ independent Poisson random variables with parameters $\lambda_1, \lambda_2, \dots, \lambda_N$, and let $Y=\sum_{i=1}^N X_i$. Then $Y$ is a Poisson random variable with parameter $\lambda_Y = \sum_{i=1}^N \lambda_i$. Then for sufficiently large $\lambda_Y$, $Y$ can
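The additivity part of the theorem (a sum of independent Poisson variables is Poisson with the summed rate) can be checked numerically. The sketch below convolves the individual probability mass functions and compares the result with the mass function of a single Poisson variable with rate $\sum_i \lambda_i$. The rates and the truncation point `max_k` are arbitrary values chosen for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def pmf_of_sum(lams, max_k):
    """PMF of X_1 + ... + X_N on 0..max_k, built by repeated convolution."""
    dist = [1.0] + [0.0] * max_k  # point mass at 0 (the empty sum)
    for lam in lams:
        pmf = [poisson_pmf(k, lam) for k in range(max_k + 1)]
        # P(S + X = k) = sum_j P(S = j) * P(X = k - j); exact for k <= max_k.
        dist = [sum(dist[j] * pmf[k - j] for j in range(k + 1))
                for k in range(max_k + 1)]
    return dist


lams = [0.3, 1.2, 2.5]   # illustrative rates for three child series
max_k = 30               # compare the two PMFs on 0..30

convolved = pmf_of_sum(lams, max_k)
direct = [poisson_pmf(k, sum(lams)) for k in range(max_k + 1)]
max_diff = max(abs(a - b) for a, b in zip(convolved, direct))
print(f"largest PMF difference: {max_diff:.2e}")
```

The two mass functions agree up to floating-point rounding, which is what the additivity statement predicts.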

Figures (7)

  • Figure 1: Overview of the HAILS pipeline. (a) The lower levels of the hierarchy tend to have sparse (red) time-series, while the higher levels have denser (blue) time-series. (b) HAILS first generates a forecast for each time-series in the hierarchy, with a parametric form that depends on the sparsity of the time-series. Forecasts for denser time-series are modeled as Gaussians and those for sparser ones as Poisson distributions. (c) The distributions are reconciled via a distributional consistency loss for each subtree. If all distributions in a subtree are of the same type, the appropriate loss is applied directly. In mixed subtrees, the Poisson distributions of the children are first approximated as Gaussians.
  • Figure 2: For homogeneous subtrees of Normal or Poisson distributions (a, c), the JSD loss is applied directly. For heterogeneous subtrees (b), the Poisson distributions are first approximated as Gaussians, and JSD is then applied across the resulting Gaussians.
  • Figure 3: HAILS has significantly lower WRMSSE than the Dow baseline across all levels of the hierarchy.
  • Figure 4: The RMSE of HAILS is consistently lower than that of the Dow baseline across the forecast horizon.
  • Figure 5: HAILS provides an average improvement of 44.14% over the Dow model across the top 7 industries and countries.
  • ...and 2 more figures
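
The mixed-subtree case described in the captions of Figures 1 and 2 can be sketched as follows. This is an illustration under stated assumptions, not the paper's DCRS loss: it assumes that a Poisson distribution with rate $\lambda$ is approximated by the moment-matched Gaussian $\mathcal{N}(\lambda, \lambda)$, that the children are independent so their means and variances add, that consistency is measured between the parent forecast and the distribution of the children's sum, and that JSD denotes the Jensen-Shannon divergence, here computed by numerical integration on a grid. All function names are hypothetical.

```python
import math


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def poisson_as_gaussian(lam: float):
    """Moment-matched Gaussian (mean, variance) for Poisson(lam); an assumption."""
    return lam, lam


def jsd_gaussians(mean_p, var_p, mean_q, var_q, n_grid=4001):
    """Jensen-Shannon divergence (in nats) between two Gaussians, on a grid."""
    sd_p, sd_q = math.sqrt(var_p), math.sqrt(var_q)
    lo = min(mean_p - 8 * sd_p, mean_q - 8 * sd_q)
    hi = max(mean_p + 8 * sd_p, mean_q + 8 * sd_q)
    dx = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        x = lo + i * dx
        p = gaussian_pdf(x, mean_p, var_p)
        q = gaussian_pdf(x, mean_q, var_q)
        m = 0.5 * (p + q)  # mixture density
        if p > 0:
            total += 0.5 * p * math.log(p / m)
        if q > 0:
            total += 0.5 * q * math.log(q / m)
    return total * dx


# Three sparse children modeled as Poisson; rates are illustrative.
child_rates = [4.0, 6.0, 10.0]
approx = [poisson_as_gaussian(lam) for lam in child_rates]
sum_mean = sum(mean for mean, _ in approx)  # means add
sum_var = sum(var for _, var in approx)     # variances add under independence

consistent = jsd_gaussians(20.0, 20.0, sum_mean, sum_var)    # parent matches children
inconsistent = jsd_gaussians(30.0, 25.0, sum_mean, sum_var)  # parent disagrees
print(f"consistent parent:   {consistent:.4f}")
print(f"inconsistent parent: {inconsistent:.4f}")
```

A parent forecast that matches the aggregated children gives a divergence of zero, while a mismatched parent gives a positive value bounded above by $\ln 2$, which is what makes such a divergence usable as a regularization term.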

Theorems & Definitions (2)

  • Definition 1
  • Theorem 1