StretchTime: Adaptive Time Series Forecasting via Symplectic Attention

Yubin Kim; Viresh Pati; Jevon Twitty; Vinh Pham; Shihao Yang; Jiecheng Lu

TL;DR

StretchTime tackles non-stationary, time-warped dynamics by learning an adaptive temporal warp through Symplectic Positional Embeddings (SyPE), which generalize Rotary Positional Embeddings (RoPE) from the rotation group $\mathrm{SO}(2)$ to the symplectic group $\mathrm{Sp}(2,\mathbb{R})$. The method combines a differentiable, input-dependent adaptive warp module with a symplectic flow, so temporal coordinates can be dilated or compressed end-to-end and locally varying periodicities can be captured. Empirically, StretchTime achieves state-of-the-art results across diverse multivariate forecasting benchmarks, with the most pronounced gains on datasets exhibiting non-stationary temporal dynamics, while remaining parameter-efficient and cheaper to compute than several baselines. The work offers a principled, geometry-inspired alternative to fixed-frequency encodings that is more robust to time warping, and it opens avenues for applying symplectic representations to broader sequence modeling tasks.
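To make the group-theoretic claim concrete, the following minimal NumPy sketch checks that every planar rotation (the block RoPE uses) satisfies the defining condition of $\mathrm{Sp}(2,\mathbb{R})$, $M^\top J M = J$, and that $\mathrm{Sp}(2,\mathbb{R})$ also contains non-rotations. The helper names (`rotation`, `squeeze`, `is_symplectic`, `is_rotation`) and the choice of a squeeze matrix are illustrative assumptions; this is not the parameterization of SyPE used in the paper.

```python
import numpy as np

# Canonical symplectic form on R^2.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def rotation(angle):
    """Element of SO(2): the 2x2 rotation block used by RoPE."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def squeeze(rate):
    """Area-preserving squeeze (illustrative choice): symplectic, not a rotation for rate != 0."""
    return np.diag([np.exp(rate), np.exp(-rate)])

def is_symplectic(m, tol=1e-10):
    """Check M^T J M == J, the defining condition of Sp(2, R)."""
    return bool(np.allclose(m.T @ J @ m, J, atol=tol))

def is_rotation(m, tol=1e-10):
    """Check M^T M == I and det M == 1, the defining conditions of SO(2)."""
    orthogonal = np.allclose(m.T @ m, np.eye(2), atol=tol)
    unit_det = np.isclose(np.linalg.det(m), 1.0, atol=tol)
    return bool(orthogonal and unit_det)

R = rotation(0.7)                 # a RoPE-style rotation
S = squeeze(0.3) @ rotation(0.7)  # a symplectic map that also stretches one axis

print("rotation: symplectic =", is_symplectic(R), ", rotation =", is_rotation(R))
print("squeeze @ rotation: symplectic =", is_symplectic(S), ", rotation =", is_rotation(S))
```

For 2x2 matrices the symplectic condition reduces to $\det M = 1$, so $\mathrm{Sp}(2,\mathbb{R})$ coincides with $\mathrm{SL}(2,\mathbb{R})$. The extra non-rotational degrees of freedom are what a learnable encoding can use beyond fixed-frequency rotation.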

Abstract

Transformer architectures have established strong baselines in time series forecasting, yet they typically rely on positional encodings that assume uniform, index-based temporal progression. However, real-world systems, from shifting financial cycles to elastic biological rhythms, frequently exhibit "time-warped" dynamics where the effective flow of time decouples from the sampling index. In this work, we first formalize this misalignment and prove that rotary position embedding (RoPE) is mathematically incapable of representing non-affine temporal warping. To address this, we propose Symplectic Positional Embeddings (SyPE), a learnable encoding framework derived from Hamiltonian mechanics. SyPE strictly generalizes RoPE by extending the rotation group $\mathrm{SO}(2)$ to the symplectic group $\mathrm{Sp}(2,\mathbb{R})$, modulated by a novel input-dependent adaptive warp module. By allowing the attention mechanism to adaptively dilate or contract temporal coordinates end-to-end, our approach captures locally varying periodicities without requiring pre-defined warping functions. We implement this mechanism in StretchTime, a multivariate forecasting architecture that achieves state-of-the-art performance on standard benchmarks, demonstrating superior robustness on datasets exhibiting non-stationary temporal dynamics.
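The mismatch between RoPE and warped time can be illustrated numerically. The sketch below first checks the RoPE relative-position property (the attention logit depends only on the index offset), then compares the per-step phase advance $\omega_0(\tau(t+1)-\tau(t))$ for an affine and a non-affine warp. The warps `tau_affine` and `tau_warped`, the frequency `omega0`, and all function names are hypothetical choices for illustration. This is a sanity check under those assumptions, not the paper's proof.

```python
import numpy as np

def rotation(angle):
    """2x2 rotation matrix, the per-frequency block used by RoPE."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def rope_score(q, k, m, n, theta):
    """Attention logit between a query at index m and a key at index n under RoPE."""
    return (rotation(theta * m) @ q) @ (rotation(theta * n) @ k)

rng = np.random.default_rng(0)
q, k = rng.normal(size=2), rng.normal(size=2)
theta = 0.3

# Relative-position property: equal offsets (n - m = 3) give equal logits.
s_a = rope_score(q, k, m=2, n=5, theta=theta)
s_b = rope_score(q, k, m=10, n=13, theta=theta)

# Hypothetical warps of the sampling index t = 1..8.
omega0 = 0.3
t = np.arange(1, 9)
tau_affine = 2.0 * t + 1.0   # affine: uniform stretch plus shift
tau_warped = t ** 1.5 / 4.0  # non-affine: time speeds up with t

def required_steps(tau):
    """Per-step phase advance that a single RoPE angle theta would have to equal."""
    return omega0 * np.diff(tau)

steps_affine = required_steps(tau_affine)
steps_warped = required_steps(tau_warped)

# A single theta can match the warp only if every step needs the same phase advance.
affine_ok = bool(np.allclose(steps_affine, steps_affine[0]))
warped_ok = bool(np.allclose(steps_warped, steps_warped[0]))

print("offset-only logits:", bool(np.isclose(s_a, s_b)))
print("affine warp matched by one theta:", affine_ok)
print("non-affine warp matched by one theta:", warped_ok)
```

Both warps keep every phase step below $\pi$ in magnitude, so the comparison is not confounded by aliasing. Under the affine warp a single angle $\theta = 2\omega_0$ suffices, whereas the non-affine warp demands a different angle at every step.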

Paper Structure (50 sections, 4 theorems, 22 equations, 3 figures, 6 tables)

Introduction
Related Work
Transformers for Time Series
Positional Encodings
Time-Varying Periodicity in Real-World Systems
Time-Varying Periodicity in Classical Statistics
Methodology
Task, Notation, and Architectural Overview
Data Structure: Temporally Warped Seasonal Dynamics
Single-Layer Self-Attention with Position Modulation
Impossibility of Standard RoPE
Method: Symplectic Positional Embeddings (SyPE)
Symplectic Flow Formulation.
Structured Generalization of RoPE.
Adaptive Warp Module.
...and 35 more sections

Key Result

Theorem 3.1

Let $\tau: \{1, \dots, N\} \to \mathbb{R}_+$ be a non-affine function. Assume the non-aliasing condition $|\omega_0(\tau(t+1) - \tau(t))| < \pi$ for all $t$. Then there exists no $\theta \in \mathbb{R}$ satisfying the RoPE relative position property:

Figures (3)

Figure 1: Visualization of Temporal Stretching
Figure 2: Overview of the SyPE-Augmented Transformer architecture.
Figure 3: Forecast visualization on warped seasonal dynamics. StretchTime (left) corrects the phase alignment errors observed in the static RoPE baseline (right).

Theorems & Definitions (11)

Theorem 3.1: Impossibility of RoPE for Non-Affine Warping
proof
Theorem 3.2: SyPE Representations of Warped Time
proof
proof
Proposition 2.1: Inconsistency of RPE with Warped Time
proof
Remark 2.2: Intuition
Proposition 2.3: Impossibility of Shared APE for Heterogeneous Warping
proof
...and 1 more
