EchoMamba4Rec: Harmonizing Bidirectional State Space Models with Spectral Filtering for Advanced Sequential Recommendation

Yuda Wang, Xuxin He, Shengxin Zhu

TL;DR

The paper addresses the scalability limitations of attention-based sequential recommender systems, whose $O(n^2)$ time complexity on long interaction histories impedes practical deployment. It introduces EchoMamba4Rec, a bi-directional, state-space-model–based framework that combines a bi-directional EchoMamba block, FFT-based spectral filtering, and Gated Linear Units (GLU) to achieve linear-time inference while capturing long-range dependencies. The authors detail the methodology, which comprises an embedding layer, a spectral filter layer, selective SSM blocks with HiPPO-based initialization, and a bi-directional prediction layer, and report superior recommendation accuracy and competitive efficiency on the MovieLens-1M and Amazon datasets compared with RNNs, Transformers, and Mamba variants. The approach offers scalable, personalized sequential recommendation suitable for large-scale, real-world systems, with potential for real-time deployment and broader cross-domain applicability.
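
The linear-time claim can be illustrated with a minimal sketch of a discretized state space recurrence, $h_t = a\,h_{t-1} + b\,x_t$, $y_t = c\,h_t$. The function name `ssm_scan` and the fixed scalar parameters `a`, `b`, `c` are assumptions made for illustration only; the paper's selective SSM blocks use input-dependent parameters and HiPPO-based initialization, neither of which is modeled here.

```python
def ssm_scan(xs, a=0.5, b=1.0, c=1.0):
    """Run a scalar linear state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Each step costs a constant amount of work, so the total cost grows
    linearly with the sequence length (attention grows quadratically).
    """
    h = 0.0  # hidden state, carried from one step to the next
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


# An impulse at the first position decays geometrically in later outputs.
print(ssm_scan([1.0, 0.0, 0.0, 0.0]))  # -> [1.0, 0.5, 0.25, 0.125]
```

The single hidden state `h` is what lets early interactions influence later outputs without comparing every pair of positions.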

Abstract

Predicting user preferences and sequential dependencies based on historical behavior is the core goal of sequential recommendation. Although attention-based models have shown effectiveness in this field, they often struggle with inference inefficiency due to the quadratic computational complexity inherent in attention mechanisms, especially with long-range behavior sequences. Drawing inspiration from the recent advancements of state space models (SSMs) in control theory, which provide a robust framework for modeling and controlling dynamic systems, we introduce EchoMamba4Rec. Control theory emphasizes the use of SSMs for managing long-range dependencies and maintaining inferential efficiency through structured state matrices. EchoMamba4Rec leverages these control relationships in sequential recommendation and integrates bi-directional processing with frequency-domain filtering to capture complex patterns and dependencies in user interaction data more effectively. Our model benefits from the ability of SSMs to learn and perform parallel computations, significantly enhancing computational efficiency and scalability. It features a bi-directional Mamba module that incorporates both forward and reverse Mamba components, leveraging information from both past and future interactions. Additionally, a filter layer operates in the frequency domain, applying a Fast Fourier Transform (FFT) and learnable filters, followed by an inverse FFT, to refine item embeddings and reduce noise. We also integrate Gated Linear Units (GLU) to dynamically control information flow, enhancing the model's expressiveness and training stability. Experimental results demonstrate that EchoMamba4Rec significantly outperforms existing models, providing more accurate and personalized recommendations.
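
The frequency-domain filter layer described above (transform, multiply by learnable filters, inverse transform) can be sketched as follows. This is a hedged illustration, not the authors' implementation: the helper names `dft`, `idft`, and `spectral_filter` are hypothetical, the filter is applied along the sequence axis of a single embedding dimension, and a naive discrete Fourier transform stands in for the FFT to keep the sketch dependency-free. A practical version would likely call a library FFT such as `numpy.fft.rfft`.

```python
import cmath


def dft(xs):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(xs)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(xs))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(X * cmath.exp(2j * cmath.pi * k * t / n) for k, X in enumerate(spectrum)) / n
        for t in range(n)
    ]


def spectral_filter(xs, weights):
    """Filter a real sequence by reweighting its frequency components.

    `weights` holds one multiplier per frequency bin; in a trained model
    these would be learnable parameters (assumption: here they are fixed).
    Only the real part of the result is kept, which is exact when the
    weights are conjugate-symmetric.
    """
    spectrum = dft(xs)
    filtered = [X * w for X, w in zip(spectrum, weights)]
    return [value.real for value in idft(filtered)]


# A constant level of 1.0 plus alternating +1/-1 noise.
noisy = [2.0, 0.0, 2.0, 0.0]
# Zeroing the highest-frequency bin removes the alternating component.
print(spectral_filter(noisy, [1.0, 1.0, 0.0, 1.0]))
```

With all weights set to 1.0 the layer returns its input unchanged up to rounding, so the learned weights decide which frequencies of the interaction sequence are attenuated as noise.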

Paper Structure (26 sections, 22 equations, 4 figures, 3 tables)

This paper contains 26 sections, 22 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Mamba block [GuDao2023]
  • Figure 2: EchoMamba4Rec. The process starts by embedding user information with a recommendation embedding layer. Next, a filter layer extracts essential sequence information. The filtered data is then processed by the bi-directional EchoMamba block, which handles sequences in both forward and reverse directions. A Gated Linear Unit (GLU) dynamically controls information flow, enhancing the model's expressiveness and stability. Finally, the processed data is combined and normalized before the sequential recommendation output is generated. Compared to Mamba4Rec, our model places greater emphasis on extracting sequence features while reducing noise, thereby improving model accuracy and robustness.
  • Figure 3: Bi-Mamba4Rec [liang2024bi]
  • Figure 4: Bi-Mambaformer4Rec [liang2024bi, xu2024integrating]
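
The bi-directional processing and gating steps named in the Figure 2 caption can be sketched as below. This is an illustration under stated assumptions rather than the paper's architecture: `causal_scan` is a hypothetical fixed-decay recurrence standing in for a Mamba block, the two directions are combined by simple addition, and the gate values are passed in explicitly instead of being produced by a learned projection.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def glu(values, gates):
    """Gated Linear Unit: each value is scaled by sigmoid(gate)."""
    return [v * sigmoid(g) for v, g in zip(values, gates)]


def causal_scan(xs, decay=0.5):
    """Stand-in for a Mamba block (assumption): h_t = decay * h_{t-1} + x_t."""
    h, out = 0.0, []
    for x in xs:
        h = decay * h + x
        out.append(h)
    return out


def bidirectional_block(xs, gates, decay=0.5):
    """Scan the sequence in both directions, combine, then gate.

    The reverse pass runs the same scan on the flipped sequence and flips
    the result back, so position t also sees items that come after it.
    Combining by addition is an assumption made for this sketch.
    """
    forward = causal_scan(xs, decay)
    backward = causal_scan(xs[::-1], decay)[::-1]
    combined = [f + b for f, b in zip(forward, backward)]
    return glu(combined, gates)


# Gates of 0.0 give sigmoid(0) = 0.5, so every combined value is halved.
print(bidirectional_block([1.0, 0.0, 0.0], gates=[0.0, 0.0, 0.0]))
# -> [1.0, 0.25, 0.125]
```

Large positive gates pass the combined signal almost unchanged and large negative gates suppress it, which is the sense in which the GLU controls information flow.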