EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention

Zhen Tian, Wayne Xin Zhao, Changwang Zhang, Xin Zhao, Zhongrui Ma, Ji-Rong Wen

TL;DR

EulerFormer addresses the challenge of modeling sequential user behavior by unifying semantic and positional information in a complex vector space through an Euler transformation and an adaptive rotation mechanism. It introduces a novel complex vector attention and a phase-contrastive objective to improve isotropy of contextual representations, enabling robust handling of varied interaction patterns. Empirical results on four public datasets show consistent gains over strong baselines and existing positional encoding methods, with competitive efficiency. By subsuming RoPE as a special case and providing controllable long-range decay, EulerFormer advances sequential recommender systems with a general, flexible framework.

Abstract

To capture user preference, transformer models have been widely applied to model sequential user behavior data. The core of the transformer architecture lies in the self-attention mechanism, which computes the pairwise attention scores in a sequence. Because self-attention is permutation-equivariant, positional encoding is used to enhance the attention between token representations. In this setting, the pairwise attention scores can be derived from both semantic difference and positional difference. However, prior studies often model the two kinds of difference measurements in different ways, which potentially limits the expressive capacity of sequence modeling. To address this issue, this paper proposes a novel transformer variant with complex vector attention, named EulerFormer, which provides a unified theoretical framework to formulate both semantic difference and positional difference. EulerFormer involves two key technical improvements. First, it employs a new transformation function for efficiently transforming the sequence tokens into polar-form complex vectors using Euler's formula, enabling the unified modeling of both semantic and positional information in a complex rotation form. Second, it develops a differential rotation mechanism, where the semantic rotation angles can be controlled by an adaptation function, enabling the adaptive integration of the semantic and positional information according to the semantic contexts. Furthermore, a phase contrastive learning task is proposed to improve the isotropy of contextual representations in EulerFormer. Our theoretical framework possesses a high degree of completeness and generality. It is more robust to semantic variations and, in principle, possesses superior theoretical properties. Extensive experiments conducted on four public datasets demonstrate the effectiveness and efficiency of our approach.
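
The abstract describes the mechanism only in words, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes each token vector is split into two halves that serve as real and imaginary parts, that each resulting complex number is written in polar form, and that the attention score is the real part of the product of the rotated query with the conjugate of the rotated key. The function names, the scalar `delta` (a stand-in for the adaptation function), and the frequency list `freqs` are all hypothetical.

```python
import cmath
import math


def euler_transform(vec):
    """Pair the first half of `vec` (real parts) with the second half
    (imaginary parts) and return the polar form as (moduli, phases).
    The half-split pairing is an assumption made for this sketch."""
    half = len(vec) // 2
    pairs = [complex(vec[i], vec[i + half]) for i in range(half)]
    moduli = [abs(z) for z in pairs]
    phases = [cmath.phase(z) for z in pairs]
    return moduli, phases


def rotate(phases, position, freqs, delta=1.0):
    """Combine semantic and positional angles in one rotation.
    `delta` is a hypothetical scalar weight on the semantic phase."""
    return [delta * p + position * w for p, w in zip(phases, freqs)]


def complex_score(q, k, pos_q, pos_k, freqs, delta=1.0):
    """Real part of <rotated q, conj(rotated k)>, using
    Re(a * conj(b)) = |a| * |b| * cos(angle_a - angle_b)."""
    mod_q, phase_q = euler_transform(q)
    mod_k, phase_k = euler_transform(k)
    ang_q = rotate(phase_q, pos_q, freqs, delta)
    ang_k = rotate(phase_k, pos_k, freqs, delta)
    return sum(
        a * b * math.cos(x - y)
        for a, b, x, y in zip(mod_q, mod_k, ang_q, ang_k)
    )


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [0.5, -1.0, 2.0, 0.0]
    freqs = [1.0, 0.01]  # illustrative rotation frequencies
    # Same position, delta = 1: approximately the plain dot product of q and k.
    print(complex_score(q, k, 0, 0, freqs))
    # Only the relative offset (here 2) affects the score when delta = 1.
    print(complex_score(q, k, 3, 1, freqs), complex_score(q, k, 7, 5, freqs))
```

With `delta = 1.0` the sketch behaves like a RoPE-style score: at equal positions it reduces to the ordinary dot product, and otherwise it depends only on the position difference. Changing `delta` reweights the semantic angle against the positional one, which is one simple way to picture the adaptive integration the abstract mentions.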

Paper Structure

This paper contains 27 sections, 16 equations, 8 figures, and 5 tables.

Figures (8)

  • Figure 1: The overall architecture of EulerFormer.
  • Figure 2: Performance analysis for different sequence lengths.
  • Figure 3: Visualization of token difference distributions at a certain position within different Transformer layers.
  • Figure 5: Visualization of the phase distributions with (w) and without (w/o) phase contrastive learning (PCL).
  • Figure 6: Performance comparison w.r.t. different $\epsilon$ and $\tau$.
  • ...and 3 more figures