Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities

Hua Tang; Chong Zhang; Mingyu Jin; Qinkai Yu; Zhenting Wang; Xiaobo Jin; Yongfeng Zhang; Mengnan Du

TL;DR

This paper systematically investigates how large language models (LLMs) perform in zero-shot time series forecasting and which input factors influence their success. By comparing LLMs to traditional methods and probing them with real and synthetic datasets, it shows that LLMs favor data with strong trend or seasonal patterns and struggle with multi-period series. It then proposes two practical prompt-engineering techniques, injecting external dataset knowledge and converting numerical sequences into natural language, which substantially improve forecasting accuracy without fine-tuning. The findings illuminate why LLMs excel on certain temporal patterns and provide actionable guidance for leveraging LLMs in time series tasks, including cost considerations and leakage checks. Overall, the work offers a benchmark and a set of effective, low-overhead strategies for enhancing LLM-based time series forecasting.

Abstract

Large language models (LLMs) have been applied in many fields and have developed rapidly in recent years. As a classic machine learning task, time series forecasting has recently been boosted by LLMs. Recent works treat large language models as zero-shot time series reasoners without further fine-tuning and achieve remarkable performance. However, several research problems remain unexplored when applying LLMs to time series forecasting under the zero-shot setting. For instance, the LLMs' preferences for the input time series are poorly understood. In this paper, by comparing LLMs with traditional time series forecasting models, we observe many interesting properties of LLMs in the context of time series forecasting. First, our study shows that LLMs perform well in predicting time series with clear patterns and trends, but face challenges with datasets lacking periodicity. This observation can be explained by the ability of LLMs to recognize the underlying period within datasets, which is supported by our experiments. In addition, we investigate the input strategy and find that incorporating external knowledge and adopting natural language paraphrases substantially improve the predictive performance of LLMs for time series. Overall, our study contributes insight into LLMs' advantages and limitations in time series forecasting under different conditions.
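To make "clear patterns and trends" concrete, the sketch below computes a trend strength and a seasonal strength for a series. It is a minimal illustration under stated assumptions, not the authors' code: it assumes the widely used definitions max(0, 1 - Var(R) / Var(T + R)) and max(0, 1 - Var(R) / Var(S + R)) over an additive decomposition into trend T, seasonal S, and remainder R, and it swaps in a simple classical moving-average decomposition where a paper would more likely use STL. The function names and the known `period` argument are hypothetical.

```python
from statistics import fmean, pvariance


def decompose(y, period):
    """Classical additive decomposition of y into trend, seasonal, remainder.

    Assumption: a centered moving average stands in for a proper STL fit.
    All three lists cover only the indices where the trend is defined.
    """
    n = len(y)
    if n < 2 * period:
        raise ValueError("need at least two full periods")
    # Centered moving-average weights (a 2 x m average when the period is even).
    if period % 2 == 0:
        weights = [0.5] + [1.0] * (period - 1) + [0.5]
    else:
        weights = [1.0] * period
    weights = [w / period for w in weights]
    half = len(weights) // 2
    idx = range(half, n - half)
    trend = [sum(w * y[i - half + j] for j, w in enumerate(weights)) for i in idx]
    detrended = [y[i] - t for i, t in zip(idx, trend)]
    # Seasonal component: mean detrended value per phase, centered on zero.
    phase_means = [
        fmean(d for i, d in zip(idx, detrended) if i % period == k)
        for k in range(period)
    ]
    offset = fmean(phase_means)
    seasonal = [phase_means[i % period] - offset for i in idx]
    remainder = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, remainder


def strength(component, remainder):
    """Return max(0, 1 - Var(R) / Var(component + R)); 0.0 for a flat series."""
    total = pvariance([c + r for c, r in zip(component, remainder)])
    if total == 0:
        return 0.0
    return max(0.0, 1.0 - pvariance(remainder) / total)


def trend_and_seasonal_strength(y, period):
    """Return (trend_strength, seasonal_strength), each in [0, 1]."""
    trend, seasonal, remainder = decompose(y, period)
    return strength(trend, remainder), strength(seasonal, remainder)
```

Under these definitions, values near 1 indicate a series dominated by its trend or seasonal component, and values near 0 indicate a series that is mostly remainder, which is the kind of input the abstract reports as harder for LLMs.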

Paper Structure (26 sections, 3 equations, 3 figures, 8 tables)

This paper contains 26 sections, 3 equations, 3 figures, 8 tables.

Introduction
Preliminaries
Large Language Model
Time Series Forecasting
What are LLMs' Preferences in Time Series Forecasting?
Analyzing Method
Preferences for Input Sequences
Implementation Details
Key Findings
Why do LLMs Forecast Well on Data with Higher Seasonal Strengths?
Implementation Details
Key Findings
How to Leverage These Insights to Improve the Model's Performance?
Dataset description and the External Knowledge incorporated in the Prompts
External Knowledge Enhancing Time Series Forecasting
...and 11 more sections

Figures (3)

Figure 1: The workflow of our analysis process. The workflow processes sequence data using different tokenization and embedding methods with various LLMs, such as GPTs and Gemini. To analyze the preferences of LLMs, we compute the seasonal and trend strengths of the datasets. Our experiments show that LLMs prefer series with higher seasonal and trend strengths. To explain this finding, we ask the LLMs to identify the underlying periods, which reveals that the models can recognize them in most cases. In addition, to improve time series forecasting performance, we propose two changes to the user input: for the input prompt, we incorporate human knowledge about the dataset sources, and for the input sequence, we reprogram the data into natural language sequences. Both methods substantially improve model performance.
Figure 2: Experiments of Sequence Focused Attention Through Counterfactual Explanation on GPT-3.5-turbo.
Figure 3: Experiments of Sequence Focused Attention Through Counterfactual Explanation on Gemini-Pro-1.0.
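The two input changes described in the Figure 1 caption (adding knowledge about the dataset source to the prompt, and rewriting the numeric sequence as natural language) can be sketched as follows. This is a hypothetical illustration only: the wording of the template, the function names `paraphrase` and `build_prompt`, and the `dataset_note` parameter are assumptions, not the prompts used in the paper, and no LLM API is called.

```python
def paraphrase(values, precision=1):
    """Describe each consecutive change in words instead of raw numbers."""
    parts = []
    for prev, cur in zip(values, values[1:]):
        if cur > prev:
            verb = "increasing to"
        elif cur < prev:
            verb = "decreasing to"
        else:
            verb = "remaining at"
        parts.append(f"from {prev:.{precision}f} {verb} {cur:.{precision}f}")
    return ", ".join(parts)


def build_prompt(values, horizon, dataset_note=None):
    """Assemble a zero-shot forecasting prompt (hypothetical template)."""
    lines = []
    if dataset_note:
        # External knowledge about where the data comes from.
        lines.append(f"Background: {dataset_note}")
    lines.append(f"The series changes as follows: {paraphrase(values)}.")
    lines.append(f"Predict the next {horizon} values, separated by commas.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(
        [112, 118, 132, 129],
        horizon=2,
        dataset_note="Monthly airline passenger counts (illustrative note).",
    ))
```

Keeping the paraphrase and the background note as separate pieces makes it straightforward to test each change on its own against a plain numeric prompt.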
