LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data

Hanyu Zhang, Chuck Arvin, Dmitry Efimov, Michael W. Mahoney, Dominique Perrault-Joncas, Shankar Ramasubramanian, Andrew Gordon Wilson, Malcolm Wolff

TL;DR

Problem: Traditional demand forecasting often underutilizes unstructured textual information about products, limiting anticipation of holiday-driven surges. Approach: Introduces LLMForecaster as a forecast post-processor that, given base forecast $f_{i,t}$, text features and numeric features, outputs $f^*_{i,t} = e^{\hat{\lambda}_{i,t}} f_{i,t}$ by learning $\hat{\lambda}_{i,t}$ via a fine-tuned LLM with LoRA. Contributions/findings: In an industry-scale retail setting, the method yields statistically significant improvements in forecast accuracy across multiple holidays, particularly when using the Holiday-Encoding Prompt. Impact: Enables better inventory planning and reduces stockouts during seasonal events by augmenting existing forecasting pipelines with unstructured data.
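The multiplicative correction in the TL;DR, $f^*_{i,t} = e^{\hat{\lambda}_{i,t}} f_{i,t}$, can be illustrated with a minimal sketch. Here the log-scale factor $\hat{\lambda}_{i,t}$ is simply passed in as a number; in the paper it is produced by the fine-tuned LLM, which this sketch does not attempt to reproduce. The function name `rescale_forecast` and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def rescale_forecast(base_forecast: float, log_scale: float) -> float:
    """Return f* = exp(lambda_hat) * f for a single product-week.

    base_forecast: the base model's forecast f (assumed non-negative).
    log_scale: the predicted log-scale factor lambda_hat (assumed given).
    """
    return math.exp(log_scale) * base_forecast


# Hypothetical values, not taken from the paper.
base = 100.0
unchanged = rescale_forecast(base, 0.0)         # lambda_hat = 0 keeps the forecast as-is
doubled = rescale_forecast(base, math.log(2))   # lambda_hat = ln 2 roughly doubles it
reduced = rescale_forecast(base, -0.5)          # negative lambda_hat shrinks it
```

Because the correction enters through an exponential, the scale factor is always positive: a non-negative base forecast stays non-negative, and $\hat{\lambda}_{i,t} = 0$ leaves the base forecast unchanged.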

Abstract

Modern time-series forecasting models often fail to make full use of rich unstructured information about the time series themselves. This lack of proper conditioning can lead to obvious model failures; for example, models may be unaware of the details of a particular product, and hence fail to anticipate seasonal surges in customer demand in the lead-up to major exogenous events like holidays for clearly relevant products. To address this shortcoming, this paper introduces a novel forecast post-processor, which we call LLMForecaster, that fine-tunes large language models (LLMs) to incorporate unstructured semantic and contextual information and historical data to improve the forecasts from an existing demand forecasting pipeline. In an industry-scale retail application, we demonstrate that our technique yields statistically significant forecast improvements across several sets of products subject to holiday-driven demand surges.

Paper Structure

This paper contains 9 sections, 2 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Aggregated demand and forecast for groups of products: (i) Mother's Day products; and (ii) Easter products. The vertical dashed lines mark the week prior to the holiday in question. In green, we show time segments where the production model anticipates event-driven demand surges --- specifically large shopping events like Christmas. In red, we show time segments where the production model fails to anticipate event-driven demand surges.
  • Figure 2: LLMForecaster incorporates text and numeric information through an LLM to rescale the raw prediction $f_{i,t}$, producing a better forecast $f^*_{i,t}$.
  • Figure 3: Example of aggregated forecasts on Easter products.
  • Figure 4: Total demand and prediction for Easter products with and without Holiday-Encoding Prompt
  • Figure 5: Forecast accuracy change for Halloween products
  • ...and 5 more figures