ARMA Cell: A Modular and Effective Approach for Neural Autoregressive Modeling
Philipp Schiele, Christoph Berninger, David Rügamer
TL;DR
This work introduces the ARMA cell, a modular neural-network unit that encodes ARMA($p$, $q$) dynamics within recurrent architectures to bridge classical time-series modeling and deep learning. It extends to VARMA and ConvARMA cells for multivariate and tensor-variate data, enabling end-to-end learning alongside regular neural components. Empirical results show that ARMA cells achieve competitive or superior performance compared to LSTM/GRU baselines across univariate, multivariate, and tensor-valued time series, while offering greater training stability and modularity. The proposed framework integrates easily into existing architectures and provides a practical path for hybrid linear and nonlinear modeling with robust optimization. The paper also offers an open-source TensorFlow implementation to foster adoption and systematic comparisons in practice.
Abstract
The autoregressive moving average (ARMA) model is a classical, and arguably one of the most studied, approaches to modeling time series data. It has compelling theoretical properties and is widely used among practitioners. More recent deep learning approaches have popularized recurrent neural networks (RNNs) and, in particular, Long Short-Term Memory (LSTM) cells, which have become one of the best-performing and most common building blocks in neural time series modeling. While advantageous for time series data or sequences with long-term effects, complex RNN cells are not always necessary and can sometimes even be inferior to simpler recurrent approaches. In this work, we introduce the ARMA cell, a simpler, modular, and effective approach for time series modeling in neural networks. This cell can be used in any neural network architecture where recurrent structures are present and naturally handles multivariate time series using vector autoregression. We also introduce the ConvARMA cell as a natural successor for spatially correlated time series. Our experiments show that the proposed methodology is competitive with popular alternatives in terms of performance while being more robust and compelling due to its simplicity.
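To make the idea of ARMA($p$, $q$) dynamics inside a recurrent unit concrete, the following is a minimal pure-Python sketch of a one-step-ahead ARMA recursion with an optional activation. It is an illustration only, not the paper's TensorFlow implementation: the function name `arma_predict`, its parameters, the zero initialization of missing lags, and the placement of the activation are all assumptions made for this example.

```python
import math


def arma_predict(x, ar_coefs, ma_coefs, intercept=0.0, activation=None):
    """One-step-ahead ARMA(p, q) predictions for a univariate series.

    Hypothetical helper for illustration; not the API of the paper's code.
    x         : sequence of observed values
    ar_coefs  : p autoregressive weights applied to x[t-1], ..., x[t-p]
    ma_coefs  : q moving-average weights applied to past residuals
    activation: optional callable applied to the linear combination
                (assumed here to mimic a neural-network nonlinearity)
    """
    p, q = len(ar_coefs), len(ma_coefs)
    preds, resid = [], []
    for t in range(len(x)):
        # AR part: weighted sum of lagged observations (missing lags count as 0).
        ar_term = sum(ar_coefs[i] * x[t - 1 - i]
                      for i in range(p) if t - 1 - i >= 0)
        # MA part: weighted sum of lagged residuals (missing lags count as 0).
        ma_term = sum(ma_coefs[j] * resid[t - 1 - j]
                      for j in range(q) if t - 1 - j >= 0)
        z = intercept + ar_term + ma_term
        y_hat = activation(z) if activation is not None else z
        preds.append(y_hat)
        # The residual is fed back at the next step; this is the recurrent state.
        resid.append(x[t] - y_hat)
    return preds, resid


# Linear ARMA(1, 1) on a toy series.
preds, resid = arma_predict([1.0, 2.0, 3.0], ar_coefs=[0.5], ma_coefs=[0.5])

# Same recursion with a tanh nonlinearity applied to the linear combination.
preds_tanh, _ = arma_predict([1.0, 2.0, 3.0], [0.5], [0.5], activation=math.tanh)
```

With no activation the recursion reduces to a classical linear ARMA predictor, while passing a nonlinearity gives one simple way to combine linear ARMA structure with a neural-network component. In a trainable cell the coefficients would be learned by gradient descent rather than fixed as they are here.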
