In-Context and Few-Shots Learning for Forecasting Time Series Data based on Large Language Models
Saroj Gopali, Bipin Chhetri, Deepika Giri, Sima Siami-Namini, Akbar Siami Namin
TL;DR
The paper addresses the challenge of forecasting time series with limited labeled data by evaluating in-context, zero-shot, and few-shot learning with large language models alongside conventional deep learning models. It systematically compares TimesFM, OpenAI o4-mini, Gemini 2.5 Flash Lite, TCN, and LSTM on the SWaT dataset and finds that TimesFM achieves the best RMSE of $0.3023$ and MAE of $0.2127$ with a mean inference time of $266$ seconds, while zero-shot o4-mini remains competitive, albeit with much higher latency. The study demonstrates the promise of time-series foundation models for accurate, scalable forecasting with minimal adaptation, and highlights trade-offs in explainability and deployment cost. It also discusses limitations such as the single-dataset focus and prompting sensitivity, and proposes future work on model optimization and uncertainty-aware forecasting to improve reliability in industrial settings.
Abstract
Existing data-driven approaches to modeling and predicting time series data include ARIMA (Autoregressive Integrated Moving Average), Transformer-based models, LSTM (Long Short-Term Memory), and TCN (Temporal Convolutional Network). These approaches, and in particular deep learning-based models such as LSTM and TCN, have shown strong results in predicting time series data. With the advent of pre-trained foundation models such as Large Language Models (LLMs), and more notably Google's recent foundation model for time series data, TimesFM (Time Series Foundation Model), it is of interest to investigate whether these foundation models can outperform existing modeling approaches in analyzing and predicting time series data. This paper investigates the performance of LLMs for time series prediction. We investigate in-context learning as a way to adapt LLMs to the underlying application domain. More specifically, the paper explores adapting LLMs through in-context, zero-shot, and few-shot learning, and forecasts time series data with OpenAI o4-mini and Gemini 2.5 Flash Lite, with Google's recent Transformer-based TimesFM (a time series-specific foundation model), and with two deep learning models, namely TCN and LSTM networks. The findings indicate that TimesFM has the best overall performance, with the lowest RMSE value (0.3023) and a competitive inference time (266 seconds). Furthermore, OpenAI's o4-mini also performs well under zero-shot learning. These findings highlight pre-trained time series foundation models as a promising direction for real-time forecasting, enabling accurate and scalable deployment with minimal model adaptation.
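To make the zero-shot versus few-shot distinction concrete, the following is a minimal sketch of how a numeric window might be serialized into a prompt and how a reply might be scored with RMSE and MAE. The prompt wording, the function names (`build_prompt`, `parse_forecast`, `rmse`, `mae`), and the comma-separated reply format are illustrative assumptions, not the prompts or code used in the paper. No LLM is called here; the reply string stands in for a model response.

```python
import math


def build_prompt(history, horizon, examples=()):
    """Serialize a series into a text prompt (hypothetical format).

    `examples` is a sequence of (window, continuation) pairs. An empty
    sequence yields a zero-shot prompt; a non-empty one yields a few-shot
    prompt with solved demonstrations placed before the query.
    """
    def fmt(values):
        return ", ".join(f"{v:.3f}" for v in values)

    lines = [
        f"Forecast the next {horizon} values of the series. "
        "Reply with comma-separated numbers only."
    ]
    for window, continuation in examples:
        lines.append("Input: " + fmt(window))
        lines.append("Output: " + fmt(continuation))
    lines.append("Input: " + fmt(history))
    lines.append("Output:")
    return "\n".join(lines)


def parse_forecast(reply, horizon):
    """Parse a comma-separated reply into exactly `horizon` floats."""
    tokens = reply.replace("\n", ",").split(",")
    values = [float(tok) for tok in tokens if tok.strip()]
    if len(values) < horizon:
        raise ValueError("reply contains fewer values than the horizon")
    return values[:horizon]


def rmse(actual, predicted):
    """Root mean squared error over paired observations."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error over paired observations."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


# Zero-shot: only the query window. Few-shot: one demonstration first.
zero_shot = build_prompt([0.10, 0.20, 0.30], horizon=2)
few_shot = build_prompt(
    [0.10, 0.20, 0.30],
    horizon=2,
    examples=[([0.50, 0.60, 0.70], [0.80, 0.90])],
)

# A made-up reply standing in for a model response.
forecast = parse_forecast("0.40, 0.55", horizon=2)
```

The only difference between the two settings in this sketch is whether solved demonstrations precede the query, which is why few-shot prompting needs no weight updates. The error metrics are computed identically in both cases.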
