Empowering Time Series Analysis with Large Language Models: A Survey
Yushan Jiang, Zijie Pan, Xikun Zhang, Sahil Garg, Anderson Schneider, Yuriy Nevmyvaka, Dongjin Song
TL;DR
This survey reviews methods that adapt pre-trained LLMs to time series tasks, motivated by the non-stationarity and data heterogeneity that make training large models from scratch difficult. It introduces a five-group taxonomy (direct query, tokenization, prompt design, fine-tuning, and model integration) and catalogs 21 representative works since 2022 across universal and spatial-temporal domains, including finance, healthcare, traffic, and computer vision. The paper details domain-specific applications, design choices, and performance patterns, and discusses open challenges such as interpretability, multi-modality, domain generalization, scaling laws, and safety. By outlining concrete opportunities for tokenization, prompting, and integration strategies, it guides researchers and practitioners in using LLMs either as flexible standalone components or as parts of larger time series pipelines. Its practical value lies in clarifying how LLMs can augment time series analysis with reasoning, external knowledge, and natural language interfaces across diverse domains.
Abstract
Recently, remarkable progress has been made in large language models (LLMs), which have demonstrated unprecedented capability across a wide variety of natural language tasks. However, training a large general-purpose model entirely from scratch is challenging for time series analysis, owing to the large volume and variety of time series data, as well as the non-stationarity that leads to concept drift and impedes continuous model adaptation and re-training. Recent advances have shown that pre-trained LLMs can be exploited to capture complex dependencies in time series data and to facilitate various applications. In this survey, we provide a systematic overview of existing methods that leverage LLMs for time series analysis. Specifically, we first state the challenges and motivations of applying language models in the context of time series and give brief preliminaries on LLMs. Next, we summarize the general pipeline for LLM-based time series analysis, categorize existing methods into different groups (i.e., direct query, tokenization, prompt design, fine-tuning, and model integration), and highlight the key ideas within each group. We also discuss the applications of LLMs to both general and spatial-temporal time series data, tailored to specific domains. Finally, we thoroughly discuss future research opportunities to empower time series analysis with LLMs.
