STELLA: Guiding Large Language Models for Time Series Forecasting with Semantic Abstractions

Junjie Fan, Hongye Zhao, Linduo Wei, Jiayu Rao, Guijia Li, Jiaxin Yuan, Wenqi Xu, Yong Qi

TL;DR

STELLA addresses the information bottleneck in adapting LLMs to time series forecasting. It decomposes each input series with a Neural STL module, generates dynamic semantic anchors from the components (a Corpus-level Semantic Prior, CSP, and a Fine-grained Behavioral Prompt, FBP), and feeds them as prompts to a decoder-only Transformer adapted with LoRA. The paper reports state-of-the-art results on eight benchmarks for long- and short-term forecasting, along with strong zero-shot and few-shot generalization, supported by ablations and disentanglement analyses. It argues that semantic-guided learning with generative, instance-specific cues is a principled way to apply LLMs to quantitative reasoning on sequential data.

Abstract

Recent adaptations of Large Language Models (LLMs) for time series forecasting often fail to effectively enhance information for raw series, leaving LLM reasoning capabilities underutilized. Existing prompting strategies rely on static correlations rather than generative interpretations of dynamic behavior, lacking critical global and instance-specific context. To address this, we propose STELLA (Semantic-Temporal Alignment with Language Abstractions), a framework that systematically mines and injects structured supplementary and complementary information. STELLA employs a dynamic semantic abstraction mechanism that decouples input series into trend, seasonality, and residual components. It then translates intrinsic behavioral features of these components into Hierarchical Semantic Anchors: a Corpus-level Semantic Prior (CSP) for global context and a Fine-grained Behavioral Prompt (FBP) for instance-level patterns. Using these anchors as prefix-prompts, STELLA guides the LLM to model intrinsic dynamics. Experiments on eight benchmark datasets demonstrate that STELLA outperforms state-of-the-art methods in long- and short-term forecasting, showing superior generalization in zero-shot and few-shot settings. Ablation studies further validate the effectiveness of our dynamically generated semantic anchors.
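
To make the decomposition-then-prompt idea concrete, here is a minimal sketch. It is not the paper's method: it assumes a classical additive decomposition (moving-average trend plus per-phase seasonal means) in place of STELLA's learned Neural STL module, and the `describe` helper with its text template is a hypothetical stand-in for the Fine-grained Behavioral Prompt, whose actual format is not given in this summary.

```python
import numpy as np

def decompose_additive(x, period):
    """Classical additive split x = trend + seasonal + residual.

    Assumption: a fixed moving average stands in for the paper's
    learned Neural STL decomposition. Requires len(x) >= period.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Trend: moving average over one period, edge-padded to keep length n.
    left = period // 2
    right = period - 1 - left
    padded = np.pad(x, (left, right), mode="edge")
    trend = np.convolve(padded, np.ones(period) / period, mode="valid")
    # Seasonality: mean of the detrended series at each phase, centered to zero.
    detrended = x - trend
    phase_means = np.array([detrended[p::period].mean() for p in range(period)])
    phase_means -= phase_means.mean()
    seasonal = np.tile(phase_means, n // period + 1)[:n]
    # Residual: whatever the two components above do not explain.
    residual = x - trend - seasonal
    return trend, seasonal, residual

def describe(trend, seasonal, residual):
    """Hypothetical text summary of the components (illustrative template only)."""
    slope = np.polyfit(np.arange(trend.size), trend, 1)[0]
    direction = "upward" if slope > 0 else "downward" if slope < 0 else "flat"
    amplitude = seasonal.max() - seasonal.min()
    return (f"trend: {direction}; seasonal amplitude: {amplitude:.2f}; "
            f"residual std: {residual.std():.2f}")

if __name__ == "__main__":
    t = np.arange(96)
    series = 0.05 * t + np.sin(2 * np.pi * t / 24)  # synthetic example data
    print(describe(*decompose_additive(series, period=24)))
```

In a STELLA-like pipeline, a string such as the one returned by `describe` would be tokenized and prepended to the numerical embeddings as a prefix-prompt; how the paper actually builds and embeds its anchors is not specified here.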

Paper Structure

This paper contains 35 sections, 26 equations, 9 figures, 12 tables.

Figures (9)

  • Figure 1: An overview of the STELLA framework. STELLA first decomposes the input series and then utilizes a dual-path architecture: a TC-Patch Encoder generates numerical embeddings, while a Semantic Anchor Module generates hierarchical prompts (CSP and FBP) to guide a frozen LLM backbone for forecasting.
  • Figure 2: UMAP visualization of Semantic-Driven Disentanglement. The disentanglement of final component representations (a) is shown to be a direct result of the clear separation of their guiding Hierarchical Semantic Anchors (b).
  • Figure 3: Qualitative forecasting results of our model (STELLA) against TimeLLM on the ETTh1 dataset for prediction horizons $H=96$ (a) and $H=192$ (b).
  • Figure 4: Qualitative forecasting results of our model (STELLA) against TimeLLM on the ETTh2 dataset for prediction horizons $H=96$ (a) and $H=192$ (b).
  • Figure 5: Qualitative forecasting results of our model (STELLA) against TimeLLM on the ETTm1 dataset for prediction horizons $H=96$ (a) and $H=192$ (b).
  • ...and 4 more figures