VITRO: Vocabulary Inversion for Time-series Representation Optimization

Filippos Bellos, Nam H. Nguyen, Jason J. Corso

TL;DR

VITRO tackles the mismatch between pre-trained LLM vocabularies and time-series data by learning a dataset-specific, data-centric vocabulary through textual inversion. The method has two stages. Stage 1 builds a per-dataset vocabulary from learnable pseudo-words $v_i$ and a shared embedding $s$. Stage 2 applies this vocabulary to forecasting through two frozen-LLM pipelines: similarity-based core lexicon selection and TimeLLM cross-attention. Across seven public datasets and multiple forecasting horizons, VITRO-enhanced approaches achieve state-of-the-art or competitive results in long-term forecasting, and qualitative analyses show a structured embedding space and interpretable attention patterns. The approach makes it possible to use frozen LLMs for time-series forecasting with reduced fine-tuning, and it leaves two directions open: extending vocabulary inversion to other time-series tasks and reducing computational cost.
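
To make the idea of similarity-based core lexicon selection concrete, the following is a minimal sketch and not the authors' implementation. This summary does not state the selection criterion, so the use of cosine similarity, the top-k rule, the function names `cosine` and `select_core_lexicon`, and the toy three-dimensional embeddings are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_core_lexicon(query, vocabulary, k):
    """Return the indices of the k vocabulary embeddings most similar to `query`.

    ASSUMPTION: ranking by cosine similarity and keeping the top k is a
    hypothetical stand-in for the paper's selection rule.
    """
    scored = [(cosine(query, emb), idx) for idx, emb in enumerate(vocabulary)]
    # Sort by descending similarity; break ties by ascending index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Toy "learned vocabulary" of four pseudo-word embeddings (made-up values).
vocabulary = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
# Toy embedding of an input time series (made-up values).
query = [1.0, 0.05, 0.0]

core = select_core_lexicon(query, vocabulary, k=2)
print(core)
```

In this toy setup the two embeddings closest in direction to the query are kept, and only that subset would be passed on to the frozen LLM pipeline.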

Abstract

Although LLMs have demonstrated remarkable capabilities in processing and generating textual data, their pre-trained vocabularies are ill-suited for capturing the nuanced temporal dynamics and patterns inherent in time series. The discrete, symbolic nature of natural language tokens, which these vocabularies are designed to represent, does not align well with the continuous, numerical nature of time series data. To address this fundamental limitation, we propose VITRO. Our method adapts textual inversion optimization from the vision-language domain in order to learn a new time series per-dataset vocabulary that bridges the gap between the discrete, semantic nature of natural language and the continuous, numerical nature of time series data. We show that learnable time series-specific pseudo-word embeddings represent time series data better than existing general language model vocabularies, with VITRO-enhanced methods achieving state-of-the-art performance in long-term forecasting across most datasets.

Paper Structure

This paper contains 11 sections, 3 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: VITRO optimizes learnable pseudo-word embeddings $v_i$ for each time series instance $X_i$ and a shared dataset embedding $s$ to construct a new data-centric time series vocabulary tailored for forecasting. Time series are normalized, patched, and embedded. These patch embeddings $E_i$ serve as prompts to guide the optimization of pseudo-words. The composite representation, including statistical features $e_{stats}$, is fed into a frozen LLM, whose output is projected to generate forecasts $\hat{Y}_i$.
  • Figure 2: PCA and t-SNE visualizations of the embedding spaces of the VITRO vocabulary and of existing general-purpose vocabularies.
  • Figure 3: Heatmaps of the VITRO vocabulary and of existing LLM vocabularies. Each row corresponds to a word in the vocabulary: the y-axis gives the word index and the x-axis the embedding dimension. Brighter colors indicate higher values.
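
The Figure 1 caption says that time series are normalized and patched before being embedded. A minimal sketch of those two preprocessing steps is given below. The per-instance zero-mean, unit-variance normalization, the `patch_len` and `stride` values, and the function names are assumptions chosen for illustration; the caption does not specify them, and the returned mean and standard deviation are not claimed to be the paper's $e_{stats}$ features.

```python
import math


def instance_normalize(series, eps=1e-5):
    """Normalize one series to zero mean and (approximately) unit variance.

    ASSUMPTION: per-instance normalization with a small `eps` for numerical
    stability; the exact scheme used by the paper is not given here.
    Returns the normalized series together with the mean and std used.
    """
    mean = sum(series) / len(series)
    var = sum((x - mean) ** 2 for x in series) / len(series)
    std = math.sqrt(var + eps)
    return [(x - mean) / std for x in series], mean, std


def make_patches(series, patch_len, stride):
    """Split a series into overlapping windows of length `patch_len`.

    Windows start every `stride` steps; a trailing remainder that is too
    short to fill a whole patch is dropped.
    """
    last_start = len(series) - patch_len
    return [series[start:start + patch_len]
            for start in range(0, last_start + 1, stride)]


# Toy input: a linear ramp of 10 points (made-up data).
series = [float(t) for t in range(10)]

normalized, mean, std = instance_normalize(series)
patches = make_patches(normalized, patch_len=4, stride=2)

print(len(patches), len(patches[0]))
```

Each patch would then be mapped to an embedding vector, giving the patch embeddings $E_i$ that the caption describes as prompts for the pseudo-word optimization.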