Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs

Yucong Luo; Yitong Zhou; Mingyue Cheng; Jiahao Wang; Daoyu Wang; Tingyue Pan; Jintao Zhang

TL;DR

Time-R1 reframes time series forecasting as a slow-thinking, reasoning-driven task by training LLMs to generate stepwise temporal explanations before forecasting. It combines a supervised warmup on chain-of-thought data with a reinforcement-learning stage guided by fine-grained, multi-objective rewards, and introduces GRIP, a policy optimization method that uses non-uniform sampling and adaptive trajectory weighting. The approach yields state-of-the-art or competitive accuracy across diverse real-world datasets, improves generalization (including zero-shot settings), and enhances interpretability through explicit reasoning traces. The authors also provide a structured training template and open-source implementation to broaden adoption in time-series applications.

Abstract

To advance time series forecasting (TSF), various methods have been proposed to improve prediction accuracy, evolving from statistical techniques to data-driven deep learning architectures. Despite their effectiveness, most existing methods still adhere to a fast-thinking paradigm: their core modeling philosophy is to extract historical patterns and map them to future values, without an explicit thinking process that incorporates intermediate time series reasoning. Meanwhile, emerging slow-thinking LLMs (e.g., OpenAI-o1) have shown remarkable multi-step reasoning capabilities, offering an alternative way to overcome these issues. However, prompt engineering alone has several limitations, including high computational cost, privacy risks, and limited capacity for in-depth, domain-specific time series reasoning. A more promising approach is to train LLMs to develop slow-thinking capabilities and acquire strong time series reasoning skills. For this purpose, we propose Time-R1, a two-stage reinforcement fine-tuning framework designed to enhance the multi-step reasoning ability of LLMs for time series forecasting. The first stage conducts supervised fine-tuning for warmup adaptation, while the second stage employs reinforcement learning to improve the model's generalization ability. In particular, we design a fine-grained multi-objective reward specifically for time series forecasting, and then introduce GRIP (group-based relative importance for policy optimization), which leverages non-uniform sampling to further encourage and optimize the model's exploration of effective reasoning paths. Experiments demonstrate that Time-R1 significantly improves forecast performance across diverse datasets.
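The abstract describes the second stage only at a high level, so the following is a minimal illustrative sketch rather than the authors' method. It assumes a hypothetical three-term reward (format validity, horizon-length match, and an MSE-based accuracy term) with made-up weights, and a generic group-relative advantage that standardizes rewards within a group of sampled outputs for the same input. GRIP's non-uniform sampling and trajectory weighting are not reproduced here, because the abstract does not specify them; the names `forecast_reward` and `group_relative_advantages` are placeholders.

```python
import math
from statistics import mean, pstdev


def forecast_reward(pred, target, horizon,
                    w_format=0.2, w_length=0.2, w_accuracy=0.6):
    """Hypothetical multi-objective reward in [0, 1] (weights are assumptions)."""
    # Format term: the model output must have parsed into finite numbers.
    format_ok = pred is not None and all(
        isinstance(v, (int, float)) and math.isfinite(v) for v in pred
    )
    if not format_ok:
        return 0.0

    # Length term: the forecast should cover exactly the requested horizon.
    length_score = 1.0 if len(pred) == horizon else 0.0

    # Accuracy term: squash MSE over the overlapping steps into (0, 1].
    n = min(len(pred), len(target))
    if n == 0:
        return w_format
    mse = mean((p - t) ** 2 for p, t in zip(pred, target))
    accuracy_score = 1.0 / (1.0 + mse)

    return w_format + w_length * length_score + w_accuracy * accuracy_score


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of samples for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Usage: score four sampled forecasts for one input window.
target = [1.0, 2.0, 3.0]
samples = [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [0.0, 0.0], None]
rewards = [forecast_reward(s, target, horizon=3) for s in samples]
advantages = group_relative_advantages(rewards)
```

In this sketch, forecasts that score above the group mean receive positive advantages and are reinforced, while malformed or inaccurate ones receive negative advantages. Any real reward design would need the terms and weights defined in the paper itself.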
