LLM-PS: Empowering Large Language Models for Time Series Forecasting with Temporal Patterns and Semantics

Jialiang Tang, Shuo Chen, Chen Gong, Jing Zhang, Dacheng Tao

TL;DR

This work tackles the gap in time series forecasting by adapting large language models to temporal data through explicit learning of Patterns and Semantics. It introduces MSCNN to capture short-term fluctuations and long-term trends and a Time-to-Text module to extract meaningful semantics from time-series patches, with a LoRA-based efficient training regime. The integrated framework enables the LLM to better model temporal dependencies, achieving state-of-the-art results across long- and short-term horizons, as well as in few-shot and zero-shot settings, while maintaining robustness to noise. Overall, LLM-PS demonstrates that combining temporal-pattern decoupling and semantic extraction with LLMs yields substantial gains for practical TSF across finance, energy, transportation, and healthcare domains.

Abstract

Time Series Forecasting (TSF) is critical in many real-world domains like financial planning and health monitoring. Recent studies have revealed that Large Language Models (LLMs), with their powerful in-context modeling capabilities, hold significant potential for TSF. However, existing LLM-based methods usually perform suboptimally because they neglect the inherent characteristics of time series data. Unlike the textual data used in LLM pre-training, time series data is semantically sparse and comprises distinctive temporal patterns. To address this problem, we propose LLM-PS to empower the LLM for TSF by learning the fundamental Patterns and meaningful Semantics from time series data. Our LLM-PS incorporates a new multi-scale convolutional neural network adept at capturing both short-term fluctuations and long-term trends within the time series. Meanwhile, we introduce a time-to-text module for extracting valuable semantics across continuous time intervals rather than isolated time points. By integrating these patterns and semantics, LLM-PS effectively models temporal dependencies, enabling a deep comprehension of time series and delivering accurate forecasts. Extensive experimental results demonstrate that LLM-PS achieves state-of-the-art performance in both short- and long-term forecasting tasks, as well as in few- and zero-shot settings.

Paper Structure

This paper contains 25 sections, 21 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: Performance of our proposed LLM-PS, LLM-based methods [liu2024taming, jintime], and conventional deep learning methods [wu2023timesnet, zeng2023dlinear, liu2023itransformer].
  • Figure 2: An overview of our proposed LLM-PS. LLM-PS incorporates a Multi-Scale Convolutional Neural Network (MSCNN) and a Time-to-Text (T2T) semantics extractor. Specifically, for an input time series $\mathbf{Y}$, MSCNN constructs multi-scale features $\mathbf{F}_{\text{MS}}$ with various receptive fields (darker colors indicate larger receptive fields), thereby capturing localized short-term fluctuations and broader long-term trends. T2T extracts features $\mathbf{F}_{\text{T2T}}$ with meaningful semantics to help the LLM understand the input time series precisely. Finally, the diverse temporal patterns and rich semantics are integrated via feature transfer and fed into the LLM to generate the forecast time series $\hat{\mathbf{Y}}$.
  • Figure 3: Diagram of our MSCNN block. The divided features are first fed into their corresponding 3$\times$3 convolutional layers to obtain features (e.g., $\bar{\mathbf{F}}_{1}$) with various receptive fields. These features are then decoupled into long-term patterns (e.g., $\mathbf{P}^{1}_{\text{L}}$) and short-term patterns (e.g., $\mathbf{P}^{1}_{\text{S}}$) using the Wavelet Transform (WT) and Inverse Wavelet Transform (IWT). Subsequently, the long-term and short-term patterns are enhanced through global-to-local and local-to-global assembling, respectively. Finally, the enhanced patterns are added together and passed through a 1$\times$1 convolutional layer to obtain the multi-scale features.
  • Figure 4: Analysis of (a) multi-scale feature extraction and (b) temporal pattern decoupling. Subfigures (c) and (d) show the MSE/MAE of various methods on the noisy ETTh1 dataset. Lower MSE/MAE indicates better model performance.
  • Figure 5: Visualization of time series from the Weather dataset and from the finance subset of the M4 dataset. The temperature readings measured by the meteorological station are generally stable, whereas stock prices in financial markets fluctuate rapidly around the average value.
  • ...and 3 more figures
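
The WT/IWT decoupling described in the Figure 3 caption can be illustrated with a small sketch. This is not the authors' implementation: the captions do not say which wavelet or how many decomposition levels are used, so the single-level Haar wavelet here is an assumption, and the function names (`haar_dwt`, `haar_idwt`, `decouple`) are hypothetical. The idea shown is only the general one: transform the signal, zero out either the high- or the low-frequency coefficients, and invert, so that the long-term and short-term parts sum back to the input.

```python
import math


def haar_dwt(x):
    """One level of the Haar wavelet transform on an even-length sequence.

    Returns (approximation, detail) coefficient lists.
    """
    if len(x) % 2 != 0:
        raise ValueError("sequence length must be even")
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the sequence from its coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def decouple(x):
    """Split x into a long-term and a short-term part (assumed Haar, one level).

    long_term keeps only the low-frequency (approximation) coefficients,
    short_term keeps only the high-frequency (detail) coefficients.
    Because the transform is linear, long_term + short_term equals x.
    """
    approx, detail = haar_dwt(x)
    zeros = [0.0] * len(approx)
    long_term = haar_idwt(approx, zeros)
    short_term = haar_idwt(zeros, detail)
    return long_term, short_term


# Toy series, made up for illustration only.
series = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 4.0, 8.0]
long_term, short_term = decouple(series)
```

With a single Haar level, `long_term` is the pairwise mean of neighbouring points and `short_term` is the deviation from that mean. A deeper decomposition or a smoother wavelet would give a smoother trend; which of these the paper uses is not stated in the captions above.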