Unveiling the Potential of Text in High-Dimensional Time Series Forecasting

Xin Zhou, Weiqing Wang, Shilin Qu, Zhiqiang Zhang, Christoph Bergmeir

TL;DR

The paper tackles high-dimensional time series forecasting by incorporating textual context through a multimodal TextFusionHTS framework. It combines PatchTST-based time-series representations with text embeddings from a Large Language Model using a cross-attention fusion module, producing forecasts $y$ for horizon $h$. Empirical results on Wiki-People and News datasets show consistent improvements in MAE and WAPE when text is included, with analysis of text feature extraction strategies revealing context-dependent benefits. The work demonstrates the potential of multimodal inputs to boost predictive accuracy in complex time-series settings and outlines directions for adding further modalities and broader evaluations.
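The two metrics named above can be sketched as follows. This is a minimal illustration rather than the paper's evaluation code: it assumes the common definitions MAE = mean(|y - ŷ|) and WAPE = Σ|y - ŷ| / Σ|y|, and the function names and toy values are hypothetical.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error over all series and horizon steps."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def wape(y_true, y_pred):
    """Weighted absolute percentage error: sum|y - y_hat| / sum|y| (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = np.sum(np.abs(y_true))
    if denom == 0:
        raise ValueError("WAPE is undefined when all true values are zero")
    return float(np.sum(np.abs(y_true - y_pred)) / denom)


# Toy data: 2 series, horizon h = 3 (values are made up for illustration).
y_true = np.array([[10.0, 20.0, 30.0], [0.0, 5.0, 15.0]])
y_pred = np.array([[12.0, 18.0, 33.0], [1.0, 5.0, 10.0]])

mae_value = mae(y_true, y_pred)    # absolute errors sum to 13 over 6 points
wape_value = wape(y_true, y_pred)  # 13 / 80
```

Because WAPE normalises by the total magnitude of the true values, it stays well defined when individual series contain zeros, which is one reason it is often preferred over per-point percentage errors when many series are evaluated together.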

Abstract

Time series forecasting has traditionally focused on univariate and multivariate numerical data, often overlooking the benefits of incorporating multimodal information, particularly textual data. In this paper, we propose a novel framework that integrates time series models with Large Language Models to improve high-dimensional time series forecasting. Inspired by multimodal models, our method combines time series and textual data in a dual-tower structure. This fusion of information creates a comprehensive representation, which is then processed through a linear layer to generate the final forecast. Extensive experiments demonstrate that incorporating text enhances high-dimensional time series forecasting performance. This work paves the way for further research in multimodal time series forecasting.
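The dual-tower idea can be illustrated with a small NumPy sketch: one tower yields time-series patch representations, the other yields text token embeddings, a cross-attention step fuses them, and a linear layer maps the fused representation to a forecast. Everything concrete here is an assumption made for illustration, not the authors' implementation: the random arrays stand in for the PatchTST and LLM outputs, the time-series patches act as queries over the text tokens, a residual connection is added, and all dimensions and weight names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(ts_tokens, text_tokens, Wq, Wk, Wv):
    """Single-head cross-attention: time-series patches query the text tokens."""
    Q = ts_tokens @ Wq                        # (n_patches, d)
    K = text_tokens @ Wk                      # (n_text, d)
    V = text_tokens @ Wv                      # (n_text, d)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (n_patches, n_text)
    return softmax(scores, axis=-1) @ V       # (n_patches, d)


# Hypothetical sizes: 8 patches, 5 text tokens, horizon h = 4.
n_patches, n_text, d_ts, d_text, d, h = 8, 5, 16, 32, 16, 4

ts_tokens = rng.normal(size=(n_patches, d_ts))    # stand-in for the time-series tower output
text_tokens = rng.normal(size=(n_text, d_text))   # stand-in for the LLM text embeddings

# Randomly initialised projection weights (these would be learned in practice).
Wq = rng.normal(size=(d_ts, d)) * 0.1
Wk = rng.normal(size=(d_text, d)) * 0.1
Wv = rng.normal(size=(d_text, d)) * 0.1
W_out = rng.normal(size=(n_patches * d, h)) * 0.1

# Fuse the two towers (the residual connection is an assumption), then apply a linear head.
fused = ts_tokens + cross_attention(ts_tokens, text_tokens, Wq, Wk, Wv)
forecast = fused.reshape(-1) @ W_out              # shape (h,)
```

Using the time-series patches as queries keeps the fused output aligned with the patch sequence, so the linear head always sees a fixed-size input regardless of how many text tokens are supplied.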

Paper Structure (13 sections, 2 equations, 2 figures, 1 table)

This paper contains 13 sections, 2 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Dual-tower structure of TextFusionHTS.
  • Figure 2: Comparison of different strategies for text feature extraction. The average of all token embeddings, the [bos] token embedding, and the [cls] token embedding are shown as purple, blue, and grey bars, respectively.
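
The text feature extraction strategies compared in Figure 2 amount to different ways of reducing a sequence of token embeddings to a single vector. The sketch below is a hypothetical illustration, not the paper's code: it assumes right-padded inputs, and it assumes that a special token such as [bos] or [cls] sits at position 0, which depends on the tokenizer and model in use. The function name and toy values are made up.

```python
import numpy as np


def pool_tokens(token_embeddings, mask, strategy="mean"):
    """Reduce (n_tokens, d) token embeddings to one (d,) text feature vector.

    mask has shape (n_tokens,), with 1 for real tokens and 0 for padding.
    """
    token_embeddings = np.asarray(token_embeddings, dtype=float)
    mask = np.asarray(mask, dtype=float)
    if strategy == "mean":
        # Average over real tokens only, ignoring padding positions.
        total = (token_embeddings * mask[:, None]).sum(axis=0)
        return total / max(mask.sum(), 1.0)
    if strategy == "first":
        # Embedding of the leading special token ([bos] or [cls], by assumption).
        return token_embeddings[0]
    raise ValueError(f"unknown strategy: {strategy}")


# Toy example: 3 real tokens plus 1 padding token, embedding size 2.
emb = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 0.0]])
mask = np.array([1, 1, 1, 0])

mean_vec = pool_tokens(emb, mask, "mean")    # average of the three real tokens
first_vec = pool_tokens(emb, mask, "first")  # embedding at position 0
```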