ReCycle: Fast and Efficient Long Time Series Forecasting with Residual Cyclic Transformers
Arvid Weyrauch, Thomas Steens, Oskar Taubert, Benedikt Hanke, Aslan Eqbal, Ewa Götz, Achim Streit, Markus Götz, Charlotte Debus
TL;DR
ReCycle introduces Primary Cycle Compression (PCC) and residual learning from Recent Historic Profiles (RHP) to enable fast, energy-efficient long time series forecasting with Transformer architectures. By transforming univariate series into cycle-based representations and learning only residuals on top of cycle patterns, ReCycle reduces the dominant $O(L^2)$ attention cost in the sequence length $L$ while preserving or improving predictive accuracy. Experiments across multiple Transformer backbones and five datasets show substantial reductions in training time and energy, with robust fallback behavior when periodic components dominate. These results suggest that ReCycle makes state-of-the-art forecasting more practical in real-world, resource-constrained environments. The method is compatible with existing architectures and can be deployed on edge devices, addressing both performance and sustainability concerns in AI for critical infrastructure forecasting.
Abstract
Transformers have recently gained prominence in long time series forecasting by improving accuracy across a variety of use cases. Regrettably, in the race for better predictive performance the overhead of model architectures has grown onerous, leading to models whose computational demands are infeasible for most practical applications. To bridge the gap between high method complexity and realistic computational resources, we introduce the Residual Cyclic Transformer, ReCycle. ReCycle utilizes primary cycle compression to address the computational complexity of the attention mechanism in long time series. By learning residuals from refined smoothing average techniques, ReCycle surpasses state-of-the-art accuracy in a variety of application use cases. The reliable and explainable fallback behavior ensured by simple, yet robust, smoothing average techniques additionally lowers the barrier for user acceptance. At the same time, our approach reduces run time and energy consumption by more than an order of magnitude, making both training and inference feasible on low-performance, low-power and edge computing devices. Code is available at https://github.com/Helmholtz-AI-Energy/ReCycle
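
The two ideas named above, cycle compression and residual learning on top of a smoothed profile, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function names `compress_cycles`, `recent_profile`, and `residual_targets`, the 24-step cycle length, and the plain moving average over the previous `window` cycles are all assumptions chosen for illustration; the paper's actual profile construction may differ.

```python
import numpy as np


def compress_cycles(series, cycle_len):
    """Reshape a univariate series into rows of one cycle each.

    Hypothetical helper: keeps the most recent complete cycles and drops
    any incomplete leading remainder, so each row can act as one token.
    """
    series = np.asarray(series, dtype=float)
    n_cycles = len(series) // cycle_len
    if n_cycles == 0:
        raise ValueError("series is shorter than one cycle")
    return series[-n_cycles * cycle_len:].reshape(n_cycles, cycle_len)


def recent_profile(cycles, window):
    """Per-cycle baseline: mean of the previous `window` cycles.

    Assumed stand-in for a recent historic profile. Rows without enough
    history are left as NaN.
    """
    profiles = np.full_like(cycles, np.nan)
    for i in range(window, len(cycles)):
        profiles[i] = cycles[i - window:i].mean(axis=0)
    return profiles


def residual_targets(cycles, window):
    """Return (residuals, baselines) for all cycles with a full history.

    A model trained on the residuals would predict baseline + residual;
    predicting a zero residual falls back to the baseline itself.
    """
    profiles = recent_profile(cycles, window)
    return cycles[window:] - profiles[window:], profiles[window:]


# Illustrative data: 14 days of hourly values with a perfect daily cycle.
cycle_len = 24
series = np.tile(np.arange(cycle_len, dtype=float), 14)

cycles = compress_cycles(series, cycle_len)      # shape (14, 24)
residuals, baselines = residual_targets(cycles, window=7)

# Pairwise attention cost scales with the squared token count.
cost_raw = len(series) ** 2                      # 336 time steps as tokens
cost_compressed = len(cycles) ** 2               # 14 cycles as tokens
cost_ratio = cost_raw // cost_compressed         # 576 == cycle_len ** 2
```

For a purely periodic input such as this one, every residual is zero, so the baseline alone reproduces the series. This mirrors the fallback behavior the abstract describes, under the simplifying assumptions stated above.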
