RATSF: Empowering Customer Service Volume Management through Retrieval-Augmented Time-Series Forecasting

Tianfeng Wang; Gaojie Cui

TL;DR

RATSF tackles non-stationary univariate time-series forecasting by retrieving and leveraging historically similar sequences. It couples a Time Series Knowledge Base (TSKB) with a Retrieval Augmented Cross-Attention (RACA) module to inject relevant history into Transformer forecasts. The framework achieves substantial error reductions across real-world service-volume data and public benchmarks, is adaptable to various Transformer variants, and offers practical staffing-cost benefits. This work demonstrates the value of retrieval-augmented time-series forecasting for industrial applications with long historical contexts.

Abstract

An efficient customer service management system hinges on precise forecasting of service volume. In this scenario, where data non-stationarity is pronounced, successful forecasting relies heavily on identifying and leveraging similar historical data rather than merely summarizing periodic patterns. Existing models based on RNN or Transformer architectures may struggle to use such historical data flexibly and effectively. To tackle this challenge, we first developed the Time Series Knowledge Base (TSKB), with an advanced indexing system for efficient historical data retrieval. We also developed the Retrieval Augmented Cross-Attention (RACA) module, a variant of the cross-attention mechanism within the Transformer's decoder layers, designed to integrate seamlessly into the vanilla Transformer architecture and assimilate key historical data segments. Together, TSKB and RACA form the backbone of our Retrieval-Augmented Time Series Forecasting (RATSF) framework. Built on these two components, RATSF not only significantly enhances performance in Fliggy hotel service volume forecasting but also adapts flexibly to various scenarios and integrates with a multitude of Transformer variants for time-series forecasting. Extensive experiments validate the effectiveness and generalizability of this system design across multiple diverse contexts.

Paper Structure (26 sections, 1 equation, 5 figures, 6 tables)

Introduction
Review
Method
Setting & Notations
Overview of RATSF
TSKB
Sequential Slicing
Embedding Learning
RACA
Deployments Procedure
TSKB Initialization
Identifying Optimal $L_v$
Adjusting Retrieval Length $L_r$
TSKB and Model Optimization
Experiments
...and 11 more sections

Figures (5)

Figure 1: RATSF uses a TSKB to store and index historical sequences, alongside a Transformer-based forecasting model. As illustrated in the bottom-right corner, the TSKB segments the full history for efficient retrieval. To forecast, the Encoder takes $\mathbf{X}_o$ from the t most recent time steps, forms a retrieval sequence from the d latest steps, and retrieves N related sequences $\mathbf{X}_r$ from the TSKB. RACA in the Decoder then merges the information in $\mathbf{X}_r$ and $\mathbf{X}_o$ to produce the forecast.
Figure 2: TSKB uses a rolling window of length $L_v$ to collect V, takes an indexing segment of length $L_r$ from the leading part of each window as K, and advances the window in steps of size S.
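The rolling-window slicing described for Figure 2 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `build_tskb` and the toy history are hypothetical, and only the roles of $L_v$, $L_r$, and S come from the caption.

```python
import numpy as np

def build_tskb(history, L_v, L_r, S):
    """Slice a 1-D history into (K, V) pairs with a rolling window.

    Hypothetical helper: V is each window of length L_v, K is the
    leading L_r steps of that window, and the window advances by S.
    """
    history = np.asarray(history, dtype=float)
    if not (0 < L_r <= L_v <= len(history)) or S <= 0:
        raise ValueError("need 0 < L_r <= L_v <= len(history) and S > 0")
    starts = np.arange(0, len(history) - L_v + 1, S)
    values = np.stack([history[s:s + L_v] for s in starts])  # V: (n, L_v)
    keys = values[:, :L_r]                                   # K: (n, L_r)
    return keys, values

# Toy history 0..9, window of 6, index segment of 4, stride 2.
keys, values = build_tskb(np.arange(10), L_v=6, L_r=4, S=2)
print(keys.shape, values.shape)  # (3, 4) (3, 6)
```

With a history of 10 points, the window starts at positions 0, 2, and 4, so three (K, V) pairs are stored. A smaller S yields more, heavily overlapping entries; a larger S yields a smaller knowledge base.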
Figure 3: During the cold-start stage, we use the raw retrieval input sequence to find the Top-K relevant sequences with DTW. After one epoch, we switch to Euclidean distance on the retrieval embeddings to gather the Top-K matches.
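The two retrieval modes in Figure 3 can be sketched as follows. This is an illustrative sketch only: `dtw_distance` is the textbook dynamic-programming DTW with an absolute-difference cost, `retrieve_top_k` is a hypothetical helper, and the mean-centering `embed` function merely stands in for the learned retrieval embedding, whose actual form is not given here.

```python
import numpy as np

def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW with absolute-difference cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def retrieve_top_k(query, keys, k, embed=None):
    """Return indices of the k closest keys (hypothetical helper).

    embed=None  -> cold start: DTW on the raw sequences.
    embed=func  -> later epochs: Euclidean distance between embeddings.
    """
    if embed is None:
        dists = np.array([dtw_distance(query, key) for key in keys])
    else:
        q = embed(query)
        dists = np.array([np.linalg.norm(q - embed(key)) for key in keys])
    return np.argsort(dists, kind="stable")[:k]

keys = np.array([[0., 1., 2., 3.],
                 [5., 5., 5., 5.],
                 [0., 0., 1., 2.],
                 [9., 8., 7., 6.]])
query = np.array([0., 1., 2., 2.])

cold_idx = retrieve_top_k(query, keys, k=2)
# Placeholder embedding (assumption): subtract the mean of the sequence.
warm_idx = retrieve_top_k(query, keys, k=2, embed=lambda x: x - x.mean())
print(cold_idx, warm_idx)  # [2 0] [0 2]
```

On this toy data the two modes rank the same pair of candidates in opposite order: DTW can warp the time axis and matches the third key exactly, while the Euclidean comparison is point-by-point and prefers the first key.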
Figure 4: Each RACA has two cross-attention modules: one utilizes the Encoder output as K,V, while the other employs embedded retrieved sequences as K,V. Both outputs are concatenated and passed through a Linear module to reshape back to the input dimensions.
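The structure described for Figure 4 can be sketched with plain NumPy. This is a simplified illustration under several assumptions: it uses single-head scaled dot-product attention with random weights, it omits residual connections and normalization, and all names (`cross_attention`, `raca`, `params`) and sizes (`d`, `L_f`, `L_enc`, `L_ret`) are hypothetical rather than taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # stabilize the exponentials
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, memory, W_q, W_k, W_v):
    """Single-head scaled dot-product attention; memory supplies K and V."""
    q, k, v = queries @ W_q, memory @ W_k, memory @ W_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def raca(dec_in, enc_out, retrieved_emb, params):
    """Two cross-attentions, concatenated and projected back to width d."""
    a_enc = cross_attention(dec_in, enc_out, *params["enc"])        # K, V from Encoder
    a_ret = cross_attention(dec_in, retrieved_emb, *params["ret"])  # K, V from retrieval
    merged = np.concatenate([a_enc, a_ret], axis=-1)                # (L_f, 2 * d)
    return merged @ params["W_out"] + params["b_out"]               # (L_f, d)

rng = np.random.default_rng(0)
d, L_f, L_enc, L_ret = 8, 4, 12, 30  # assumed toy sizes
params = {
    "enc": [rng.normal(size=(d, d)) * 0.1 for _ in range(3)],
    "ret": [rng.normal(size=(d, d)) * 0.1 for _ in range(3)],
    "W_out": rng.normal(size=(2 * d, d)) * 0.1,
    "b_out": np.zeros(d),
}
out = raca(rng.normal(size=(L_f, d)),
           rng.normal(size=(L_enc, d)),
           rng.normal(size=(L_ret, d)),
           params)
print(out.shape)  # (4, 8)
```

The final linear projection halves the concatenated width, so the output has the same shape as the decoder input and the module can stand in where a standard cross-attention layer would be.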
Figure 5: This figure shows RACA's use of retrieved sequences for a random forecast sequence $\mathbf{X_f^*}$. The lower half features the top 3 similar sequences of $\mathbf{X_f^*}$, concatenated along the time axis to form RACA's input. Attention weights at the first forecast point ($x_{t+1}$) are marked by data bars, with yellow and green indicating RACA's focus on downward-curving segments. The upper half replicates prediction results (yellow "pred" lines for $x_{t+1}...x_{t+f}$) and true values (orange "gt" lines), aligned with the corresponding final $L_f$ lengths of the retrieved sequences. This visual comparison highlights the model's pattern recognition, with RACA's focus areas correlating to the predictions' inflection at $x_{t+1}$. For this sample, the model notably focuses on segments indicating ascent and predicts a subsequent increase in value.
