Efficient User Sequence Learning for Online Services via Compressed Graph Neural Networks

Yucheng Wu, Liyue Chen, Yu Cheng, Shuai Chen, Jinyu Xu, Leye Wang

TL;DR

ECSeq addresses the efficiency bottleneck of applying graph neural networks to large-scale online user sequences by introducing a unified graph-compression framework for sequence relation modeling. It decomposes the problem into sequence embedding extraction and relation modeling on a compressed graph of representative sequences, which enables scalable training and fast online inference. A key contribution is the integration of multiple graph-compression strategies with an auxiliary training objective that aligns real and compressed nodes, providing interpretability via case-based reasoning while remaining plug-and-play with pre-trained sequence models. Empirical results on fraud detection and user mobility show that ECSeq improves the R@P$_{0.9}$ of the widely used LSTM by roughly 5%, while keeping inference time within $10^{-4}$ seconds per sample and requiring only tens of seconds of additional training on 100,000+ sequences.
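
The two-stage decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation: plain k-means stands in for one possible graph-compression strategy, mean aggregation over the nearest compressed nodes stands in for the GNN relation-modeling step, and the random `emb` array stands in for embeddings produced by a pre-trained sequence model. The names `kmeans_compress`, `enhance`, `k`, and `top_m` are hypothetical.

```python
import numpy as np

def kmeans_compress(emb, k, iters=20, seed=0):
    """Compress N embeddings into k representative nodes (plain k-means, an assumption)."""
    rng = np.random.default_rng(seed)
    centers = emb[rng.choice(len(emb), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every embedding to every center: shape (N, k).
        dist = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        assign = dist.argmin(axis=1)
        for j in range(k):
            members = emb[assign == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def enhance(query, centers, top_m=3):
    """Concatenate a query embedding with the mean of its top_m nearest compressed nodes."""
    dist = ((centers - query) ** 2).sum(axis=-1)
    nearest = np.argsort(dist)[:top_m]
    neighbor_mean = centers[nearest].mean(axis=0)
    return np.concatenate([query, neighbor_mean])

# Stand-in for sequence embeddings from a pre-trained model (assumption).
emb = np.random.default_rng(1).normal(size=(1000, 16))
centers = kmeans_compress(emb, k=8)   # offline: build the compressed node set once
enhanced = enhance(emb[0], centers)   # online: cost depends on k, not on N
```

At inference time a new sequence is compared only against the `k` compressed nodes, which is why the per-sample cost in this sketch is independent of the number of training sequences.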

Abstract

Learning representations of user behavior sequences is crucial for various online services, such as online fraudulent transaction detection. Graph Neural Networks (GNNs) have been extensively applied to model relationships among sequences and extract information from similar sequences. However, because the volume of user behavior sequence data is usually huge in online applications, directly applying GNN models may incur substantial computational overhead during both training and inference, making it challenging to meet the real-time requirements of online services. In this paper, we leverage graph compression techniques to alleviate this efficiency issue. Specifically, we propose a novel unified framework, ECSeq, that introduces graph compression techniques into relation modeling for user sequence representation learning. The key module of ECSeq is sequence relation modeling, which explores relationships among sequences to enhance sequence representation learning and employs graph compression algorithms to achieve high efficiency and scalability. ECSeq is also plug-and-play: it augments pre-trained sequence representation models without modifying them. Experiments on both sequence classification and regression tasks demonstrate the effectiveness of ECSeq. In particular, with additional training time of tens of seconds in total on 100,000+ sequences and inference time kept within $10^{-4}$ seconds/sample, ECSeq improves the R@P$_{0.9}$ of the widely used LSTM by $\sim 5\%$.
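
The R@P$_{0.9}$ metric is not defined in this section. Assuming it denotes recall at a precision of at least 0.9, as is common in fraud detection, the sketch below shows one way to compute it from binary labels and scores. The function name `recall_at_precision` and the toy data are hypothetical.

```python
import numpy as np

def recall_at_precision(y_true, scores, min_precision=0.9):
    """Highest recall over score thresholds whose precision is >= min_precision.

    Assumes binary labels (1 = positive). Each prefix of the score-sorted list
    is treated as one threshold, so tied scores are not grouped together.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")   # highest score first
    hits = y_true[order]
    tp = np.cumsum(hits)                         # true positives among the top-k
    k = np.arange(1, len(hits) + 1)
    precision = tp / k
    recall = tp / max(int(y_true.sum()), 1)
    ok = precision >= min_precision
    return float(recall[ok].max()) if ok.any() else 0.0

# Toy example: three positives, two of which are ranked above every negative.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
r_at_p = recall_at_precision(labels, scores)     # 2/3 for this toy data
```

In the toy data, only the two top-ranked predictions can be accepted before precision drops below 0.9, so two of the three positives are recovered.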

Paper Structure

This paper contains 28 sections, 13 equations, 7 figures, and 7 tables.

Figures (7)

  • Figure 1: Top: An example user behavior sequence consisting of three events, each with several fields. Bottom: An example user behavior sequence for online shopping.
  • Figure 1: Summary of typical graph compression methods. $N$: number of nodes, $M$: number of edges, $D$: dimension of node features, $K$: number of clusters/compressed nodes, $c$: some absolute constant. Traceable: whether the source of the compressed nodes is known; Configurable: whether the compression method can assign separate compressed node quantities for each category.
  • Figure 2: Overview of ECSeq. First, the sequence embedding extraction module transforms sequence information into a one-dimensional feature vector. Then, the sequence relation modeling module explores and leverages relationships among sequences to enhance the sequence representation, employing a suitable graph compression technique to reduce computational overhead and improve inference efficiency.
  • Figure 3: Illustration of diverse graph compression methods.
  • Figure 4: ECSeq Training Procedure
  • ...and 2 more figures