SSD4Rec: A Structured State Space Duality Model for Efficient Sequential Recommendation

Haohao Qu, Yifeng Zhang, Liangbo Ning, Wenqi Fan, Qing Li

TL;DR

The paper addresses the inefficiency of existing sequential recommenders on long, variable-length user histories. It introduces SSD4Rec, a Mamba-based backbone that uses Structured State Space Duality (SSD) to obtain attention-like sequence modeling at a cost linear in the sequence length $L$, and it avoids padding and truncation through a masked, variable-length input construction. Each Bi-SSD layer combines a forward and a backward SSD block to capture context from both directions and is stabilized by residual connections, LayerNorm, dropout, and a position-wise feed-forward network (PFFN). On four benchmark datasets, SSD4Rec reports state-of-the-art accuracy (e.g., gains in NDCG and MRR) and substantial training and inference speedups over both Transformer-based and other Mamba-based baselines. The approach is aimed at real-world systems with long interaction histories, and the code is publicly available to support adoption and further research.
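To make the "duality" in the summary above concrete, the following is a minimal sketch with a scalar state, not the paper's implementation. It assumes a per-step recurrence $h_t = a_t h_{t-1} + b_t x_t$, $y_t = c_t h_t$ and shows that a linear-time scan and an attention-like lower-triangular matrix form give the same outputs. The function names, the toy values, and the rule `forward + beta * backward` used to merge the two directions are illustrative assumptions; the summary does not state how the backward branch is combined.

```python
def ssm_recurrent(x, a, b, c):
    """Linear-time scan: h_t = a_t * h_{t-1} + b_t * x_t, y_t = c_t * h_t."""
    h, y = 0.0, []
    for x_t, a_t, b_t, c_t in zip(x, a, b, c):
        h = a_t * h + b_t * x_t
        y.append(c_t * h)
    return y


def ssm_matrix(x, a, b, c):
    """Attention-like form: y_t = sum over s <= t of M[t][s] * x_s, where
    M[t][s] = c_t * (a_{s+1} * ... * a_t) * b_s.
    The decay product is recomputed naively here for readability."""
    y = []
    for t in range(len(x)):
        total = 0.0
        for s in range(t + 1):
            decay = 1.0
            for k in range(s + 1, t + 1):
                decay *= a[k]
            total += c[t] * decay * b[s] * x[s]
        y.append(total)
    return y


def bi_ssm(x, a, b, c, beta):
    """Hypothetical bidirectional merge (assumption): run the same scan on the
    reversed sequence, flip it back, and add it with weight beta."""
    fwd = ssm_recurrent(x, a, b, c)
    bwd = ssm_recurrent(x[::-1], a[::-1], b[::-1], c[::-1])[::-1]
    return [f + beta * g for f, g in zip(fwd, bwd)]


# Toy inputs, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0]
a = [0.9, 0.8, 0.7, 0.6]
b = [1.0, 0.5, 0.25, 2.0]
c = [1.0, 1.0, 2.0, 0.5]

recurrent_out = ssm_recurrent(x, a, b, c)
matrix_out = ssm_matrix(x, a, b, c)
both_directions = bi_ssm(x, a, b, c, beta=0.5)
```

The recurrent form touches each position once, which is where the linear cost in $L$ comes from, while the matrix form exposes the same computation as masked matrix multiplication.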

Abstract

Sequential recommendation methods are crucial in modern recommender systems for their remarkable capability to understand a user's changing interests based on past interactions. However, a significant challenge faced by current methods (e.g., RNN- or Transformer-based models) is to capture users' preferences effectively and efficiently when modeling long behavior sequences, which limits their use in applications such as short-video platforms where user interactions are numerous. Recently, an emerging architecture named Mamba, built on state space models (SSMs) with efficient hardware-aware designs, has shown tremendous potential for sequence modeling, presenting a compelling avenue for addressing this challenge. Inspired by this, we propose a novel generic and efficient sequential recommendation backbone, SSD4Rec, which explores the seamless adaptation of Mamba for sequential recommendation. Specifically, SSD4Rec marks variable-length and long item sequences with sequence registers and processes the item representations with bidirectional Structured State Space Duality (SSD) blocks. This not only allows for hardware-aware matrix multiplication but also provides strong capabilities in variable-length and long-range sequence modeling. Extensive evaluations on four benchmark datasets demonstrate that the proposed model achieves state-of-the-art performance while maintaining near-linear scalability with user sequence length. Our code is publicly available at https://github.com/ZhangYifeng1995/SSD4Rec.
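The contrast between fixed-length inputs and register-marked variable-length inputs can be sketched as follows. This is only an illustration under assumptions: the abstract says sequences are marked with sequence registers but does not give the exact layout, so the single trailing register token, the `register_id` and `pad_id` values, and both function names are hypothetical.

```python
def pad_batch(seqs, max_len, pad_id=0):
    """Conventional fixed-length input: keep the most recent max_len items
    of each sequence and left-pad shorter sequences with pad_id."""
    out = []
    for s in seqs:
        recent = s[-max_len:]
        out.append([pad_id] * (max_len - len(recent)) + recent)
    return out


def pack_batch(seqs, register_id=-1):
    """Variable-length input (assumed layout): concatenate all sequences into
    one token stream and close each with a register token. seq_ids records
    which user sequence every position belongs to, so a model can keep
    state from leaking across sequence boundaries."""
    tokens, seq_ids = [], []
    for i, s in enumerate(seqs):
        tokens.extend(s + [register_id])
        seq_ids.extend([i] * (len(s) + 1))
    return tokens, seq_ids


# Three users with histories of length 2, 5, and 1 (toy item ids).
histories = [[5, 3], [7, 8, 9, 2, 4], [6]]

padded = pad_batch(histories, max_len=4)
# padded == [[0, 0, 5, 3], [8, 9, 2, 4], [0, 0, 0, 6]]
# Five slots are padding and item 7 is truncated away.

tokens, seq_ids = pack_batch(histories)
# tokens  == [5, 3, -1, 7, 8, 9, 2, 4, -1, 6, -1]
# seq_ids == [0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2]
```

In the packed form no item is dropped and no position is spent on padding; the only extra tokens are the registers themselves.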

Paper Structure (32 sections, 10 equations, 5 figures, 7 tables)

This paper contains 32 sections, 10 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: The overall framework of the proposed SSD4Rec for variable- and long-length sequential recommendation, which consists of the input construction with variable-length item sequences in a batch and the bidirectional block constructed with State Space Duality (SSD) layers for efficient and effective sequence modeling.
  • Figure 2: Compared to the typical sequential recommendation input with a fixed sequence length, the proposed input construction strategy achieves the seamless adaptation of Mamba to sequential recommendations, thus avoiding information loss and additional computation.
  • Figure 3: The effect of the backward weighted indicator $\beta$ on NDCG@10, MRR@10, and HR@10.
  • Figure 4: The effect of the mask ratio indicator $\rho$ on NDCG@10, MRR@10, and HR@10.
  • Figure 5: The effect of the maximum sequence length $L$ on the ML1M and KuaiRand datasets.
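
The captions above report NDCG@10, MRR@10, and HR@10. As a rough sketch of how these metrics are commonly computed, the snippet below assumes a setting with exactly one ground-truth next item per user, so the ideal DCG is 1. The evaluation protocol is not stated in this summary, so that setting and the helper name `rank_metrics` are assumptions.

```python
import math


def rank_metrics(rank, k=10):
    """Per-user HR@k, NDCG@k, MRR@k, given the 1-based rank of the single
    ground-truth item in the model's ranked list (assumed setting)."""
    if rank > k:
        return 0.0, 0.0, 0.0
    hit = 1.0                            # the item appears in the top k
    ndcg = 1.0 / math.log2(rank + 1)     # DCG of one hit; ideal DCG is 1
    mrr = 1.0 / rank                     # reciprocal rank, cut off at k
    return hit, ndcg, mrr


def average_metrics(ranks, k=10):
    """Average the per-user metrics over a list of ranks."""
    per_user = [rank_metrics(r, k) for r in ranks]
    n = len(per_user)
    return tuple(sum(m[i] for m in per_user) / n for i in range(3))


# Ground-truth item ranked 1st, 3rd, and 11th for three users.
hr, ndcg, mrr = average_metrics([1, 3, 11], k=10)
```

A rank outside the top 10 contributes zero to all three metrics, which is why they reward placing the target item near the top rather than merely somewhere in the list.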