Table of Contents
Fetching ...

FuXi-Linear: Unleashing the Power of Linear Attention in Long-term Time-aware Sequential Recommendation

Yufei Ye, Wei Guo, Hao Wang, Luankang Zhang, Heng Chang, Hong Zhu, Yuyang Ye, Yong Liu, Defu Lian, Enhong Chen

TL;DR

FuXi-Linear is proposed, a linear-complexity model designed for efficient long-sequence recommendation that outperforms state-of-the-art models in recommendation quality, while achieving up to 10 times speedup in the prefill stage and up to 21 times speedup in the decode stage compared to competitive baselines.

Abstract

Modern recommendation systems primarily rely on attention mechanisms with quadratic complexity, which limits their ability to handle long user sequences and slows down inference. While linear attention is a promising alternative, existing research faces three critical challenges: (1) temporal signals are often overlooked or integrated via naive coupling that causes mutual interference between temporal and semantic signals while neglecting behavioral periodicity; (2) insufficient positional information provided by existing linear frameworks; and (3) a primary focus on short sequences and shallow architectures. To address these issues, we propose FuXi-Linear, a linear-complexity model designed for efficient long-sequence recommendation. Our approach introduces two key components: (1) a Temporal Retention Channel that independently computes periodic attention weights using temporal data, preventing crosstalk between temporal and semantic signals; (2) a Linear Positional Channel that integrates positional information through learnable kernels within linear complexity. Moreover, we demonstrate that FuXi-Linear exhibits a robust power-law scaling property at a thousand-length scale, a characteristic largely unexplored in prior linear recommendation studies. Extensive experiments on sequences of several thousand tokens demonstrate that FuXi-Linear outperforms state-of-the-art models in recommendation quality, while achieving up to 10$\times$ speedup in the prefill stage and up to 21$\times$ speedup in the decode stage compared to competitive baselines. Our code has been released in a public repository https://github.com/USTC-StarTeam/fuxi-linear.

FuXi-Linear: Unleashing the Power of Linear Attention in Long-term Time-aware Sequential Recommendation

TL;DR

FuXi-Linear is proposed, a linear-complexity model designed for efficient long-sequence recommendation that outperforms state-of-the-art models in recommendation quality, while achieving up to 10 times speedup in the prefill stage and up to 21 times speedup in the decode stage compared to competitive baselines.

Abstract

Modern recommendation systems primarily rely on attention mechanisms with quadratic complexity, which limits their ability to handle long user sequences and slows down inference. While linear attention is a promising alternative, existing research faces three critical challenges: (1) temporal signals are often overlooked or integrated via naive coupling that causes mutual interference between temporal and semantic signals while neglecting behavioral periodicity; (2) insufficient positional information provided by existing linear frameworks; and (3) a primary focus on short sequences and shallow architectures. To address these issues, we propose FuXi-Linear, a linear-complexity model designed for efficient long-sequence recommendation. Our approach introduces two key components: (1) a Temporal Retention Channel that independently computes periodic attention weights using temporal data, preventing crosstalk between temporal and semantic signals; (2) a Linear Positional Channel that integrates positional information through learnable kernels within linear complexity. Moreover, we demonstrate that FuXi-Linear exhibits a robust power-law scaling property at a thousand-length scale, a characteristic largely unexplored in prior linear recommendation studies. Extensive experiments on sequences of several thousand tokens demonstrate that FuXi-Linear outperforms state-of-the-art models in recommendation quality, while achieving up to 10 speedup in the prefill stage and up to 21 speedup in the decode stage compared to competitive baselines. Our code has been released in a public repository https://github.com/USTC-StarTeam/fuxi-linear.
Paper Structure (40 sections, 26 equations, 4 figures, 6 tables)

This paper contains 40 sections, 26 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: The overall architecture of FuXi-Linear.
  • Figure 2: The efficiency comparison across different sequence lengths.
  • Figure 3: Scaling performance of FuXi-Linear with respect to model size on Kuairand-27K.
  • Figure 4: Hyperparameter study on FuXi-Linear performance.