xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories

Maurice Kraus, Felix Divo, Devendra Singh Dhami, Kristian Kersting

TL;DR

xLSTM-Mixer is a three-stage, memory-efficient approach to multivariate time-series forecasting. It first produces a channel-independent linear forecast, then refines it with a stacked sLSTM that performs joint time-variate mixing, and finally reconciles two views (the original and the reversed embeddings) through a learned projection. The method achieves state-of-the-art long-horizon performance across diverse datasets while using significantly less memory than Transformer-based rivals. It is also versatile, delivering competitive probabilistic forecasts on GIFT-Eval and strong results when used as a time-series embedding for classification. Comprehensive ablations and analyses confirm that time mixing, memory-based cross-variate mixing, and multi-view reconciliation together are the core drivers of its robustness and accuracy.

Abstract

Time series data is prevalent across numerous fields, necessitating the development of robust and accurate forecasting models. Capturing patterns both within and between temporal and multivariate components is crucial for reliable predictions. We introduce xLSTM-Mixer, a model designed to effectively integrate temporal sequences, joint time-variate information, and multiple perspectives for robust forecasting. Our approach begins with a linear forecast shared across variates, which is then refined by xLSTM blocks. They serve as key elements for modeling the complex dynamics of challenging time series data. xLSTM-Mixer ultimately reconciles two distinct views to produce the final forecast. Our extensive evaluations demonstrate its superior long-term forecasting performance compared to recent state-of-the-art methods while requiring very little memory. A thorough model analysis provides further insights into its key components and confirms its robustness and effectiveness. This work contributes to the resurgence of recurrent models in forecasting by combining them, for the first time, with mixing architectures.
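
To make the first step concrete, the following is a minimal NumPy sketch of a linear forecast shared across variates in the NLinear style (subtract the last observed value, apply one linear map over the time axis, add the value back). The function name `nlinear_forecast`, the shapes, and the random weights are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def nlinear_forecast(x, weight, bias):
    """NLinear-style forecast with one linear map shared by all variates.

    x:      (variates, lookback) input window
    weight: (horizon, lookback)  shared linear map over the time axis
    bias:   (horizon,)
    """
    last = x[:, -1:]                    # last observed value per variate, (V, 1)
    centered = x - last                 # remove the level of each series
    out = centered @ weight.T + bias    # time mixing only, (V, H)
    return out + last                   # restore the level

# Hypothetical sizes and untrained weights, purely for illustration.
rng = np.random.default_rng(0)
lookback, horizon, variates = 96, 24, 3
x = rng.normal(size=(variates, lookback))
weight = rng.normal(scale=0.01, size=(horizon, lookback))
bias = np.zeros(horizon)

forecast = nlinear_forecast(x, weight, bias)
print(forecast.shape)
```

Because the same `weight` is applied to every row of `x`, no information flows between variates at this stage; any cross-variate interaction has to come from the later refinement.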

Paper Structure

This paper contains 29 sections, 8 equations, 8 figures, 14 tables.

Figures (8)

  • Figure 1: xLSTM-Mixer provides excellent forecasts with a very low memory footprint while being sufficiently fast. Details are found in the model analysis section.
  • Figure 2: The xLSTM-Mixer architecture consists of three stages: (1) An initial NLinear forecast assuming channel independence and performing time mixing; (2) subsequent joint mixing, which mixes variate and time information through crucial applications of sLSTM blocks; and (3) view mixing, where the two latent forecast views are reconciled into a coherent final forecast.
  • Figure 3: xLSTM-Mixer provides convincing forecasts. This figure shows example forecasts on the Weather and ETTm1 datasets for multiple models with lookback windows and forecasting horizons fixed at 96. The first panel illustrates the forecast from xLSTM-Mixer, while the second shows the initial forecast extracted before the up-projection step, highlighting the effectiveness of our added components. Comparisons with further baselines are provided for context.
  • Figure 4: Impact of model parameters on forecasting performance.
  • Figure 5: xLSTM-Mixer is statistically significantly better than all baselines except xLSTMTime. Shown is the critical difference diagram for the MSE at a horizon of $H=96$. Horizontal bars connect methods that are not significantly different at $p = 0.05$.
  • ...and 3 more figures
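
The data flow of stages (2) and (3) in the Figure 2 caption can be sketched roughly as follows. This is only a shape-level illustration under several assumptions: a plain tanh recurrence stands in for the sLSTM blocks (it is not an sLSTM), the recurrence steps over variates, the second view is obtained by reversing along the embedding axis, and both views share weights. All names (`recurrent_over_variates`, `mix_views`, `w_up`, `w_in`, `w_rec`, `w_out`) are hypothetical.

```python
import numpy as np

def recurrent_over_variates(emb, w_in, w_rec):
    """Stand-in for the sLSTM stack: a plain tanh recurrence (NOT an sLSTM).

    emb: (variates, dim); each variate is treated as one recurrent step,
    so the hidden state carries information from earlier variates to later ones.
    """
    h = np.zeros(w_rec.shape[0])
    outputs = []
    for step in emb:
        h = np.tanh(w_in @ step + w_rec @ h)
        outputs.append(h)
    return np.stack(outputs)            # (V, D)

def mix_views(initial_forecast, w_up, w_in, w_rec, w_out):
    """Joint mixing on two views, then a learned projection to the forecast."""
    emb = initial_forecast @ w_up.T                              # up-projection, (V, D)
    view_a = recurrent_over_variates(emb, w_in, w_rec)           # original embedding
    view_b = recurrent_over_variates(emb[:, ::-1], w_in, w_rec)  # reversed embedding (assumed axis)
    both = np.concatenate([view_a, view_b], axis=1)              # (V, 2D)
    return both @ w_out.T                                        # view mixing, (V, H)

# Hypothetical sizes and untrained weights, purely for illustration.
rng = np.random.default_rng(0)
variates, horizon, dim = 3, 24, 16
initial = rng.normal(size=(variates, horizon))   # pretend output of stage (1)
w_up = rng.normal(scale=0.1, size=(dim, horizon))
w_in = rng.normal(scale=0.1, size=(dim, dim))
w_rec = rng.normal(scale=0.1, size=(dim, dim))
w_out = rng.normal(scale=0.1, size=(horizon, 2 * dim))

final = mix_views(initial, w_up, w_in, w_rec, w_out)
print(final.shape)
```

In this sketch the only path between variates is the recurrent state, which is one way to read the "mixing via scalar memories" in the title; the actual sLSTM gating and normalization are omitted.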