StretchTime: Adaptive Time Series Forecasting via Symplectic Attention

Yubin Kim, Viresh Pati, Jevon Twitty, Vinh Pham, Shihao Yang, Jiecheng Lu

TL;DR

StretchTime tackles non-stationary time-warped dynamics by learning adaptive temporal warping through Symplectic Positional Embeddings (SyPE), a generalization of Rotary Positional Embeddings within the symplectic group $\mathrm{Sp}(2,\mathbb{R})$. The method combines a differentiable adaptive warp module with a symplectic flow, enabling end-to-end dilation or compression of temporal coordinates and robust handling of locally varying periodicities. Empirically, StretchTime achieves state-of-the-art results across diverse multivariate forecasting benchmarks, with pronounced advantages on datasets exhibiting non-stationary temporal dynamics, while maintaining high parameter efficiency and lower computational cost than several baselines. This work provides a principled, geometry-inspired alternative to fixed-frequency encodings, improving robustness to time-warping and opening avenues for applying symplectic representations to broader sequence modeling tasks.
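The claim that $\mathrm{Sp}(2,\mathbb{R})$ generalizes the rotations used by RoPE can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's implementation: it uses the standard definition of a symplectic matrix ($M^\top J M = J$), and the helper names `is_symplectic`, `rotation`, and `squeeze` are hypothetical. It shows that every 2x2 rotation is symplectic, and that $\mathrm{Sp}(2,\mathbb{R})$ also contains non-orthogonal elements such as an area-preserving squeeze.

```python
import numpy as np

# Standard symplectic form on R^2.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def is_symplectic(M, tol=1e-9):
    """True if M preserves the symplectic form, i.e. M^T J M = J."""
    return np.allclose(M.T @ J @ M, J, atol=tol)

def rotation(angle):
    """Element of SO(2): the kind of 2x2 block RoPE applies to a feature pair."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def squeeze(r):
    """Illustrative non-rotation element of Sp(2, R): stretch one axis by r,
    shrink the other by 1/r, so that area (the determinant) is preserved."""
    return np.array([[r, 0.0], [0.0, 1.0 / r]])

R = rotation(0.7)
S = squeeze(2.0)

rotation_is_symplectic = is_symplectic(R)                 # SO(2) sits inside Sp(2, R)
squeeze_is_symplectic = is_symplectic(S)                  # so does the squeeze
squeeze_is_orthogonal = np.allclose(S.T @ S, np.eye(2))   # but it is not a rotation

print(rotation_is_symplectic, squeeze_is_symplectic, squeeze_is_orthogonal)
```

For 2x2 matrices the condition $M^\top J M = J$ reduces to $\det M = 1$, which is why the squeeze qualifies even though it changes vector lengths. How SyPE parameterizes such elements is not specified in this summary.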

Abstract

Transformer architectures have established strong baselines in time series forecasting, yet they typically rely on positional encodings that assume uniform, index-based temporal progression. However, real-world systems, from shifting financial cycles to elastic biological rhythms, frequently exhibit "time-warped" dynamics where the effective flow of time decouples from the sampling index. In this work, we first formalize this misalignment and prove that rotary position embedding (RoPE) is mathematically incapable of representing non-affine temporal warping. To address this, we propose Symplectic Positional Embeddings (SyPE), a learnable encoding framework derived from Hamiltonian mechanics. SyPE strictly generalizes RoPE by extending the rotation group $\mathrm{SO}(2)$ to the symplectic group $\mathrm{Sp}(2,\mathbb{R})$, modulated by a novel input-dependent adaptive warp module. By allowing the attention mechanism to adaptively dilate or contract temporal coordinates end-to-end, our approach captures locally varying periodicities without requiring pre-defined warping functions. We implement this mechanism in StretchTime, a multivariate forecasting architecture that achieves state-of-the-art performance on standard benchmarks, demonstrating superior robustness on datasets exhibiting non-stationary temporal dynamics.
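The intuition behind the impossibility claim can be sketched numerically. The example below is a hedged illustration, not the paper's proof: the warp `tau(t) = 0.05 * t**2`, the base frequency `omega0`, and all function names are assumptions chosen for the demonstration. With uniform index time, the relative rotation between two positions depends only on their index gap. Under a non-affine warp, the same index gap produces different phase differences at different locations, so no single fixed frequency can reproduce them.

```python
import numpy as np

def rot(angle):
    """2x2 rotation matrix."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def rope_relative(theta, m, n):
    """Relative rotation between index positions m and n under RoPE.
    Algebraically this equals rot(theta * (n - m))."""
    return rot(theta * m).T @ rot(theta * n)

def tau(t):
    """Hypothetical non-affine warp of the time index (an assumption)."""
    return 0.05 * t**2

omega0 = 0.1  # assumed base frequency, small enough to avoid aliasing here

def warped_relative(m, n):
    """Relative rotation when phases follow warped time tau instead of the index."""
    return rot(omega0 * tau(m)).T @ rot(omega0 * tau(n))

# Same index gap (3) taken at two different locations in the sequence.
same_gap_uniform = np.allclose(rope_relative(0.1, 2, 5), rope_relative(0.1, 10, 13))
same_gap_warped = np.allclose(warped_relative(2, 5), warped_relative(10, 13))

print(same_gap_uniform)  # index-based RoPE: gap alone determines the rotation
print(same_gap_warped)   # warped time: the rotation also depends on location
```

In this example the warped phase difference for the gap from 2 to 5 is $0.1 \cdot 0.05 \cdot (25 - 4) = 0.105$, while for the gap from 10 to 13 it is $0.1 \cdot 0.05 \cdot (169 - 100) = 0.345$, so a rotation that is a function of the index gap alone cannot match both.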

Paper Structure

This paper contains 50 sections, 4 theorems, 22 equations, 3 figures, 6 tables.

Key Result

Theorem 3.1

Let $\tau: \{1, \dots, N\} \to \mathbb{R}_+$ be a non-affine function. Assume the non-aliasing condition $|\omega_0(\tau(t+1) - \tau(t))| < \pi$ for all $t$. Then there exists no $\theta \in \mathbb{R}$ satisfying the RoPE relative position property:

Figures (3)

  • Figure 1: Visualization of Temporal Stretching
  • Figure 2: Overview of the SyPE-Augmented Transformer architecture.
  • Figure 3: Forecast visualization on warped seasonal dynamics. StretchTime (left) corrects the phase alignment errors observed in the static RoPE baseline (right).

Theorems & Definitions (11)

  • Theorem 3.1: Impossibility of RoPE for Non-Affine Warping
  • proof
  • Theorem 3.2: SyPE Representations of Warped Time
  • proof
  • proof
  • Proposition 2.1: Inconsistency of RPE with Warped Time
  • proof
  • Remark 2.2: Intuition
  • Proposition 2.3: Impossibility of Shared APE for Heterogeneous Warping
  • proof
  • ...and 1 more