Time-TK: A Multi-Offset Temporal Interaction Framework Combining Transformer and Kolmogorov-Arnold Networks for Time Series Forecasting
Fan Zhang, Shiming Fan, Hua Wang
TL;DR
Time-TK addresses the information bottleneck that arises in long-horizon forecasting when each time step is embedded as an independent token. It introduces a multi-offset paradigm built from three components: a Multi-Offset Time Embedding (MOTE), a Multi-Offset Interactive KAN (MI-KAN) with Gaussian RBFs, and a Multi-Offset Temporal Interaction (MOTI) mechanism. The architecture fuses multiple offset sub-sequences with the original series through a global interaction mechanism, yielding a lightweight predictor that uses historical data $\mathcal{X} \in \mathbb{R}^{N \times \mathcal{L}}$ to produce a forecast $\hat{\mathcal{Y}} \in \mathbb{R}^{N \times \mathcal{F}}$ of the target $\textbf{Y}$. Across 14 real-world datasets, Time-TK achieves state-of-the-art results, ranking first in 23 of 26 settings and showing statistically significant improvements over strong baselines, while maintaining favorable memory and computational efficiency. The work also shows that integrating MOTE into other architectures yields consistent gains, underlining the method's generality for scalable long-term time-series forecasting in web-scale environments.
Abstract
Time series forecasting is crucial for the World Wide Web and is a core technical challenge in keeping modern web services, such as intelligent transportation and website throughput, stable and efficient. However, we find that existing methods typically embed each time step as an independent token. This paradigm introduces a fundamental information bottleneck when processing long sequences. The root cause is that independent token embedding destroys a crucial structure within the sequence, which we term multi-offset temporal correlation: the fine-grained dependencies that span different time steps and are especially prevalent in regular Web data. To address this issue, we propose a new perspective on time series embedding. We provide an upper bound on the approximate reconstruction performance of token embedding, which guides our design of a concise yet effective Multi-Offset Time Embedding (MOTE) method that mitigates the performance degradation caused by standard token embedding. Furthermore, MOTE can be integrated into various existing models and serve as a universal building block. Based on this paradigm, we design a novel forecasting architecture named Time-TK. The architecture first uses a Multi-Offset Interactive KAN to learn and represent specific temporal patterns among multiple offset sub-sequences. It then employs an efficient Multi-Offset Temporal Interaction mechanism to capture the complex dependencies between these sub-sequences, achieving global information integration. Extensive experiments on 14 real-world benchmark datasets, covering domains such as traffic flow and BTC/USDT throughput, demonstrate that Time-TK significantly outperforms all baseline models, achieving state-of-the-art forecasting accuracy.
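To make the idea of "offset sub-sequences" and Gaussian RBFs more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the offset sub-sequences are obtained by strided sampling of the input (start at offset o, then take every k-th step), which the abstract does not specify. The function names `offset_subsequences` and `gaussian_rbf_features`, and the parameters `num_offsets`, `centers` and `width`, are hypothetical. The learnable weights that a KAN layer would apply to the basis values are omitted.

```python
import math
from typing import List, Sequence


def offset_subsequences(series: Sequence[float], num_offsets: int) -> List[List[float]]:
    """Split a series into `num_offsets` strided sub-sequences.

    ASSUMPTION: sub-sequence o holds series[o], series[o + k], series[o + 2k], ...
    with k = num_offsets. The abstract does not define the construction.
    """
    if num_offsets < 1:
        raise ValueError("num_offsets must be >= 1")
    return [list(series[o::num_offsets]) for o in range(num_offsets)]


def gaussian_rbf_features(x: float, centers: Sequence[float], width: float) -> List[float]:
    """Evaluate the Gaussian basis exp(-((x - c) / width)^2) at each center c.

    ASSUMPTION: fixed centers and a shared width; a trained model would
    combine these basis values with learned coefficients.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return [math.exp(-((x - c) / width) ** 2) for c in centers]


# A toy series of 12 time steps split into 3 offset sub-sequences.
series = [float(t) for t in range(12)]
subs = offset_subsequences(series, num_offsets=3)

# Basis expansion of a single scalar value on three evenly spaced centers.
feats = gaussian_rbf_features(0.5, centers=[0.0, 0.5, 1.0], width=0.5)
```

Under this assumed construction, each sub-sequence keeps the samples that share the same position within a period of `num_offsets` steps, so a dependency between steps that are `num_offsets` apart becomes a dependency between neighbors within one sub-sequence.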
