On the Internal Semantics of Time-Series Foundation Models

Atharva Pandey, Abhilash Neog, Gautam Jajoo

TL;DR

This work probes how Time-Series Foundation Models (TSFMs) internalize fundamental temporal phenomena. It adopts a concept-centric, probe-based framework using linear probes, structural probes, and Centered Kernel Alignment (CKA) across seven canonical concepts (AR(1), Level Shift, Random Walk, Spectral, Time-Warped, Trend, Variance Shift) to map locality, recoverability, and depth-wise abstraction. Key findings show early layers capturing local time-domain structure while deeper layers specialize in dispersion and change-point signals; spectral and time-warping factors remain hard to recover linearly. The study also reveals generally near-linear compositionality for atomic concepts, with interference in some pairs, and demonstrates that Chronos tends to produce more organized, separable representations than MOMENT. These insights guide architectural and evaluation choices for TSFMs and motivate future work on non-linear or causal probes to capture interacting temporal phenomena.
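For readers unfamiliar with CKA, the following is a minimal sketch of the standard linear variant, which compares two sets of activations computed on the same inputs. It is not taken from the paper: the function name `linear_cka` and the synthetic matrices `layer_a`, `layer_b`, and `layer_c` are illustrative assumptions standing in for pooled hidden states from two layers, and the paper may use a different kernel or estimator.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between activation matrices of shape (n_samples, dim).

    Both matrices must describe the same n_samples inputs, row for row.
    """
    # Center each feature so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Synthetic stand-ins for layer activations (assumption: not real model outputs).
rng = np.random.default_rng(0)
layer_a = rng.normal(size=(200, 16))

# A rotated and rescaled copy of layer_a: same geometry, different basis.
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
layer_b = (layer_a @ rotation) * 3.0

# An unrelated random representation.
layer_c = rng.normal(size=(200, 16))

cka_same = linear_cka(layer_a, layer_b)
cka_diff = linear_cka(layer_a, layer_c)
print(f"rotated copy: {cka_same:.3f}, unrelated: {cka_diff:.3f}")
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling, so the rotated copy scores close to 1 while the unrelated matrix scores much lower. That invariance is what makes the measure suitable for comparing layers whose hidden dimensions are not aligned.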

Abstract

Time-series Foundation Models (TSFMs) have recently emerged as a universal paradigm for learning across diverse temporal domains. However, despite their empirical success, the internal mechanisms by which these models represent fundamental time-series concepts remain poorly understood. In this work, we undertake a systematic investigation of concept interpretability in TSFMs. Specifically, we examine: (i) which layers encode which concepts, (ii) whether concept parameters are linearly recoverable, (iii) how representations evolve in terms of concept disentanglement and abstraction across model depth, and (iv) how models process compositions of concepts. We systematically probe these questions using layer-wise analyses, linear recoverability tests, and representation similarity measures, providing a structured account of TSFM semantics. The resulting insights show that early layers mainly capture local, time-domain patterns (e.g., AR(1), level shifts, trends), while deeper layers encode dispersion and change-time signals, with spectral and warping factors remaining the hardest to recover linearly. In compositional settings, however, probe performance degrades, revealing interference between concepts. This highlights that while atomic concepts are reliably localized, composition remains a challenge, underscoring a key limitation in current TSFMs' ability to represent interacting temporal phenomena.
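A linear recoverability test of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: `standin_embedding` is a hypothetical placeholder for a pooled hidden state from one TSFM layer (here, randomly projected autocorrelations), and the ridge penalty, series length, and train/test split are arbitrary choices. With a real model, the features would be extracted per layer and the probe refit at each depth.

```python
import numpy as np

rng = np.random.default_rng(0)


def simulate_ar1(phi, length=256):
    """Draw one AR(1) series x[t] = phi * x[t-1] + noise[t]."""
    x = np.zeros(length)
    noise = rng.normal(size=length)
    for t in range(1, length):
        x[t] = phi * x[t - 1] + noise[t]
    return x


def standin_embedding(series, projection):
    """HYPOTHETICAL stand-in for a layer's pooled hidden state.

    Uses the first 8 sample autocorrelations, mixed by a random projection.
    """
    centered = series - series.mean()
    denom = centered @ centered
    lags = np.array(
        [centered[k:] @ centered[:-k] / denom for k in range(1, 9)]
    )
    return projection @ lags


def add_bias(feats):
    """Append a constant column so the probe can learn an intercept."""
    return np.hstack([feats, np.ones((len(feats), 1))])


def fit_ridge_probe(feats, targets, alpha=1e-3):
    """Closed-form ridge regression (the small penalty also covers the bias)."""
    design = add_bias(feats)
    gram = design.T @ design + alpha * np.eye(design.shape[1])
    return np.linalg.solve(gram, design.T @ targets)


def probe_mse(weights, feats, targets):
    """Mean squared error of the probe's predictions."""
    return float(np.mean((add_bias(feats) @ weights - targets) ** 2))


# Concept parameter to recover: the AR(1) coefficient phi.
phis = rng.uniform(-0.9, 0.9, size=400)
projection = rng.normal(size=(32, 8))
feats = np.stack(
    [standin_embedding(simulate_ar1(p), projection) for p in phis]
)

train, test = slice(0, 300), slice(300, 400)
weights = fit_ridge_probe(feats[train], phis[train])
mse = probe_mse(weights, feats[test], phis[test])

# Baseline: always predict the mean training coefficient.
baseline = float(np.mean((phis[test] - phis[train].mean()) ** 2))
print(f"probe MSE: {mse:.4f}, mean-predictor MSE: {baseline:.4f}")
```

A probe MSE well below the mean-predictor baseline indicates that the parameter is linearly decodable from the features. Repeating the fit on each layer's features yields the kind of layer-wise MSE curve used to localize a concept.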

Paper Structure

This paper contains 49 sections, 21 equations, 49 figures.

Figures (49)

  • Figure 1: UMAP of pooled embeddings at early, mid, and late layers, time-warp concept.
  • Figure 2: Layer-wise probe (y-axis MSE; x-axis layers) for Chronos (left) and MOMENT (right). Each curve represents a concept.
  • Figure 3: Context length ablations on MOMENT.
  • Figure 4: Vector arithmetic experiments with Chronos. Atomic embeddings combine nearly linearly ($\mathbf{emb_{1}} + \mathbf{emb_{2}} \approx \mathbf{emb_{3}}$), except for temporally disparate concept pairs.
  • Figure 5: Chronos – Temporal alignment experiments. We show stability of compositional relationships across multiple atomic-concept pairs.
  • ...and 44 more figures