RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks

Haowen Hou, F. Richard Yu

TL;DR

RWKV-TS introduces a linear-time, encoder-based RNN for time-series tasks that uses time- and channel-mixing RWKV blocks to achieve competitive performance with lower latency and memory than Transformer- and CNN-based baselines. The architecture combines instance-normalized patching, a multi-head WKV operator, and gated output, yielding an encoder-only model capable of long-range dependency capture. Across long- and short-term forecasting, few-shot learning, anomaly detection, imputation, and classification, RWKV-TS attains strong results while maintaining scalability, motivating further research into efficient RNN-inspired time-series models. The work provides open-source code and highlights the continued viability of RNN-based approaches for real-world time-series applications.

Abstract

Traditional Recurrent Neural Network (RNN) architectures, such as LSTM and GRU, have historically held prominence in time series tasks. However, they have recently seen a decline in their dominant position across various time series tasks. As a result, recent advancements in time series forecasting have seen a notable shift away from RNNs towards alternative architectures such as Transformers, MLPs, and CNNs. To go beyond the limitations of traditional RNNs, we design an efficient RNN-based model for time series tasks, named RWKV-TS, with three distinctive features: (i) A novel RNN architecture characterized by $O(L)$ time complexity and memory usage. (ii) An enhanced ability to capture long-term sequence information compared to traditional RNNs. (iii) High computational efficiency coupled with the capacity to scale up effectively. Through extensive experimentation, our proposed RWKV-TS model demonstrates competitive performance when compared to state-of-the-art Transformer-based or CNN-based models. Notably, RWKV-TS exhibits not only comparable performance but also demonstrates reduced latency and memory utilization. The success of RWKV-TS encourages further exploration and innovation in leveraging RNN-based approaches within the domain of Time Series. The combination of competitive performance, low latency, and efficient memory usage positions RWKV-TS as a promising avenue for future research in time series tasks. Code is available at: https://github.com/howard-hou/RWKV-TS

Paper Structure (49 sections, 9 equations, 2 figures, 14 tables)

Figures (2)

  • Figure 1: RWKV-TS is an RNN-based time-series model that achieves strong performance and efficiency simultaneously. In contrast, other RNN models are considered to perform poorly on both counts for time-series tasks.
  • Figure 2: Architecture of RWKV-TS. RWKV-TS comprises three main components: an input module, an RWKV backbone, and an output module. First, the input module applies instance normalization to each channel's univariate series and segments it into patches. These patches serve as input tokens for RWKV-TS. The input tokens then pass into the RWKV backbone, which comprises Time-mixing and Channel-mixing modules. Finally, the output of the last layer of the RWKV backbone is flattened and projected to predict the target.
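
The input module described in the Figure 2 caption (per-channel instance normalization followed by patching) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the patch length, the stride, the epsilon, and the choice of overlapping patches are all assumptions made for illustration, since this page does not state them.

```python
from statistics import fmean, pstdev


def instance_normalize(series, eps=1e-5):
    """Rescale one channel's univariate series to zero mean and roughly unit variance.

    `eps` is an assumed stabilizer that avoids division by zero on constant series.
    """
    mean = fmean(series)
    std = pstdev(series, mu=mean)
    return [(x - mean) / (std + eps) for x in series]


def make_patches(series, patch_len, stride):
    """Split a series into patches of length `patch_len`, stepping by `stride`.

    Trailing values that do not fill a complete patch are dropped (an assumption).
    """
    last_start = len(series) - patch_len
    return [series[i:i + patch_len] for i in range(0, last_start + 1, stride)]


def tokenize(multivariate, patch_len=4, stride=2):
    """Handle each channel independently: normalize it, then cut it into patches."""
    return [
        make_patches(instance_normalize(channel), patch_len, stride)
        for channel in multivariate
    ]


# Two hypothetical channels of length 8.
channels = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [10.0, 10.0, 20.0, 20.0, 30.0, 30.0, 40.0, 40.0],
]
tokens = tokenize(channels)
print(len(tokens), len(tokens[0]), len(tokens[0][0]))
```

With these assumed settings, each channel of length 8 yields patches starting at positions 0, 2, and 4. In the architecture the caption describes, each patch would then serve as one input token to the RWKV backbone.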