EIDOS: Latent-Space Predictive Learning for Time Series Foundation Models

Xinxing Zhou, Qingren Yao, Yiji Zhao, Chenghao Liu, Flora Salim, Xiaojie Yuan, Yanlong Wen, Ming Jin

TL;DR

This work introduces EIDOS, a foundation model family that shifts pretraining from future value prediction to latent-space predictive learning. A causal Transformer is trained to predict the evolution of latent representations, which encourages structured and temporally coherent latent states to emerge.

Abstract

Most time series foundation models are pretrained by directly predicting future observations, which often yields weakly structured latent representations that capture surface noise rather than coherent and predictable temporal dynamics. In this work, we introduce EIDOS, a foundation model family that shifts pretraining from future value prediction to latent-space predictive learning. We train a causal Transformer to predict the evolution of latent representations, encouraging the emergence of structured and temporally coherent latent states. To ensure stable targets for latent-space learning, we design a lightweight aggregation branch to construct target representations. EIDOS is optimized via a joint objective that integrates latent-space alignment, observational grounding to anchor representations to the input signal, and direct forecasting supervision. On the GIFT-Eval benchmark, EIDOS mitigates structural fragmentation in the representation space and achieves state-of-the-art performance. These results demonstrate that constraining models to learn predictable latent dynamics is a principled step toward more robust and reliable time series foundation models.
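The abstract names three terms in the joint objective but this page does not give their exact form. The sketch below is one plausible reading, not the authors' implementation. It assumes mean squared error for latent-space alignment, mean squared error between a reconstruction and the input for observational grounding, and a pinball (quantile) loss for forecasting supervision. The function names and the weights `w_latent`, `w_ground`, and `w_forecast` are hypothetical.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pinball(pred: float, target: float, q: float) -> float:
    """Pinball (quantile) loss for one prediction at quantile level q."""
    diff = target - pred
    return max(q * diff, (q - 1.0) * diff)


def joint_objective(
    pred_latent: Sequence[float],
    target_latent: Sequence[float],
    reconstruction: Sequence[float],
    inputs: Sequence[float],
    quantile_preds: Sequence[float],
    target: float,
    quantiles: Sequence[float],
    w_latent: float = 1.0,    # hypothetical weight, not from the paper
    w_ground: float = 1.0,    # hypothetical weight, not from the paper
    w_forecast: float = 1.0,  # hypothetical weight, not from the paper
) -> float:
    """Weighted sum of the three assumed loss terms."""
    if len(quantile_preds) != len(quantiles):
        raise ValueError("need one prediction per quantile level")
    # Latent-space alignment: predicted latent state vs. target latent state.
    latent_loss = mse(pred_latent, target_latent)
    # Observational grounding: keep the representation tied to the input signal.
    grounding_loss = mse(reconstruction, inputs)
    # Forecasting supervision: average pinball loss over the quantile levels.
    forecast_loss = sum(
        pinball(p, target, q) for p, q in zip(quantile_preds, quantiles)
    ) / len(quantiles)
    return (
        w_latent * latent_loss
        + w_ground * grounding_loss
        + w_forecast * forecast_loss
    )
```

In this reading, the grounding and forecasting terms keep the latent states tied to observed values, while the alignment term rewards latent states that are predictable from the past. How the paper actually balances the three terms is not stated here.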

Paper Structure (51 sections, 12 equations, 12 figures, 8 tables)

Figures (12)

  • Figure 1: Latent-space (Eidos) versus observation-space (Chronos-2) prediction. (Left) Latent Representations: UMAP visualization of the final-layer embeddings from each model under clean and noisy inputs, using time series with increasing frequencies. Eidos maintains a smooth and ordered manifold under noise, while Chronos-2 exhibits fragmented structure. (Right) Noise Robustness: When forecasting under burst noise, the structured latent space of Eidos enables stable predictions and is less sensitive to surface-level noise than Chronos-2.
  • Figure 2: Comparison of predictive mechanisms. (a) Next Token Prediction: Direct prediction of future observations $y$ in the raw observation space. (b) JEPA: Latent-space alignment requiring a separate $y$-encoder to generate target representations $h_y$. (c) Ours: Latent-space prediction with a unified encoder, where target representations remain anchored to the raw observation $y$.
  • Figure 3: Latent-Space Predictive Learning
  • Figure 4: Architecture of Eidos with forecasting length $l=2$. The input time series is mapped to point-wise latent embeddings by SiGLU and processed by a causal Transformer to predict future latent states. A lightweight aggregator constructs target states from future segments, and predictions are aligned in the latent space. The quantile head operates on these latent representations to produce probabilistic forecasts over multiple future steps.
  • Figure 5: Performance on the GIFT-Eval benchmark across 97 evaluation configurations. Results are presented as Normalized MASE and Normalized CRPS where lower values indicate better accuracy. The legend distinguishes between different training paradigms. "Zero-Shot" models have no prior exposure to the benchmark data. "In-Domain" indicates that the benchmark training sets were included in the pre-training corpus, while "TestData Leakage" denotes models partially trained on the benchmark test sets. "Task Specific" and "Statistical" methods represent supervised and traditional baselines.
  • ...and 7 more figures
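
The Figure 5 caption reports Normalized MASE, where lower is better. As a rough illustration of what that metric measures, the sketch below computes MASE (mean absolute scaled error) in its standard form: forecast MAE divided by the in-sample MAE of a seasonal naive forecast. The `normalized_score` helper assumes normalization means dividing a model's score by a baseline's score on the same series. That is an assumption, since this page does not describe the benchmark's normalization or aggregation. Both function names are hypothetical.

```python
from typing import Sequence


def mase(
    history: Sequence[float],
    actual: Sequence[float],
    forecast: Sequence[float],
    season: int = 1,
) -> float:
    """Forecast MAE scaled by the in-sample seasonal naive MAE."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(history) <= season:
        raise ValueError("history must be longer than the seasonal period")
    # In-sample error of predicting each point by the value one season earlier.
    scale = sum(
        abs(history[i] - history[i - season])
        for i in range(season, len(history))
    ) / (len(history) - season)
    if scale == 0:
        raise ValueError("seasonal naive error is zero; MASE is undefined")
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


def normalized_score(model_score: float, baseline_score: float) -> float:
    """Assumed normalization: ratio of model score to baseline score."""
    if baseline_score == 0:
        raise ValueError("baseline score must be non-zero")
    return model_score / baseline_score


history = [1.0, 2.0, 3.0, 4.0]
actual = [5.0, 6.0]
model_mase = mase(history, actual, forecast=[5.0, 8.0])
naive_mase = mase(history, actual, forecast=[4.0, 4.0])  # repeat last value
ratio = normalized_score(model_mase, naive_mase)  # below 1 beats the baseline
```

Under this assumed normalization, a value below 1 means the model is more accurate than the baseline on that series.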