Can Slow-thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting

Mingyue Cheng, Jiahao Wang, Daoyu Wang, Xiaoyu Tao, Qi Liu, Enhong Chen

TL;DR

The paper investigates whether slow-thinking LLMs can perform time series forecasting by reframing TSF as a conditional reasoning task. It introduces TimeReasoner, a training-free framework that uses hybrid prompts, inference-time reasoning, and multiple reasoning strategies to produce forecasts and reasoning traces, aggregating results across generations. Empirical results across diverse datasets show competitive zero-shot performance, especially for complex temporal dynamics, and offer insights into prompt design, reasoning strategies, and uncertainty. The work highlights both the promise of reasoning-based forecasting and the need for principled uncertainty quantification and robust reasoning improvements for reliable deployment.

Abstract

Time series forecasting (TSF) is a fundamental and widely studied task, spanning methods from classical statistical approaches to modern deep learning and multimodal language modeling. Despite their effectiveness, these methods often follow a fast-thinking paradigm that emphasizes pattern extraction and direct value mapping while overlooking explicit reasoning over temporal dynamics and contextual dependencies. Meanwhile, emerging slow-thinking LLMs (e.g., ChatGPT-o1, DeepSeek-R1) have demonstrated impressive multi-step reasoning capabilities across diverse domains, suggesting a new opportunity for reframing TSF as a structured reasoning task. This motivates a key question: can slow-thinking LLMs effectively reason over temporal patterns to support time series forecasting, even in a zero-shot manner? To investigate this, we propose TimeReasoner, an extensive empirical study that formulates TSF as a conditional reasoning task. We design a series of prompting strategies to elicit inference-time reasoning from pretrained slow-thinking LLMs and evaluate their performance across diverse TSF benchmarks. Our findings reveal that slow-thinking LLMs exhibit non-trivial zero-shot forecasting capabilities, especially in capturing high-level trends and contextual shifts. While preliminary, our study surfaces important insights into the reasoning behaviors of LLMs in temporal domains, highlighting both their potential and limitations. We hope this work catalyzes further research into reasoning-based forecasting paradigms and paves the way toward more interpretable and generalizable TSF frameworks.

Paper Structure

This paper contains 35 sections, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Overall framework of TimeReasoner for training-free reasoning-based time series forecasting.
  • Figure 2: Comparison of TimeReasoner’s forecasting capabilities under the three reasoning strategies with experiments conducted on ETTh2 (left) and Exchange (right) datasets.
  • Figure 3: Relationship between TimeReasoner's forecasting performance and the lookback window and prediction window.
  • Figure 4: The 80% confidence interval and average of predictions across 50 independent generations given the same input on the ETTh1 dataset.
  • Figure 5: Standard deviation of predictions at each forecast step on the ETTh1 and ETTh2 datasets.
  • ...and 5 more figures
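
The captions for Figures 4 and 5 describe summarizing repeated generations for the same input: an average prediction, an 80% interval, and a per-step standard deviation. The sketch below shows one way such a summary could be computed. It is not taken from the paper: the function names `percentile` and `aggregate_generations` are hypothetical, and the use of empirical 10th and 90th percentiles for the 80% band is an assumption, since the summary does not say how the interval was obtained.

```python
from statistics import mean, stdev


def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 1]) of an ascending list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def aggregate_generations(generations, coverage=0.8):
    """Summarize repeated forecasts of the same input, step by step.

    generations: list of forecasts, each a list of floats with the same
    horizon (hypothetical layout, one list per independent generation).
    coverage: central mass of the empirical interval (0.8 gives 10%..90%).
    """
    if len(generations) < 2:
        raise ValueError("need at least two generations to measure spread")
    horizon = len(generations[0])
    if any(len(g) != horizon for g in generations):
        raise ValueError("all generations must share the same horizon")

    tail = (1 - coverage) / 2
    summary = []
    for step in range(horizon):
        vals = sorted(g[step] for g in generations)
        summary.append({
            "mean": mean(vals),                    # average prediction
            "std": stdev(vals),                    # sample standard deviation
            "lower": percentile(vals, tail),       # lower edge of the band
            "upper": percentile(vals, 1 - tail),   # upper edge of the band
        })
    return summary


if __name__ == "__main__":
    # Three toy generations over a two-step horizon (illustrative values only).
    toy = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
    for step, stats in enumerate(aggregate_generations(toy)):
        print(step, stats)
```

An empirical percentile band makes no distributional assumption about the generations, which seems a reasonable default when only a few dozen samples are available. A band of the mean plus or minus a multiple of the standard deviation would be an alternative under a normality assumption.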