
Sample Enrichment via Temporary Operations on Subsequences for Sequential Recommendation

Shu Chen, Jinwei Luo, Weike Pan, Jiangxing Yu, Xin Huang, Zhong Ming

TL;DR

This paper tackles data sparsity in sequential recommendation by identifying a two-step transformation space between observed data and true user preferences: from input samples to preferences, and from preferences to target samples. It proposes SETO, a model-agnostic framework that temporarily enriches subsequences via two rationality-constrained operations, Swap and Removal, applied during training to both input and target subsequences. Experiments with six single-domain and two cross-domain backbone models on multiple real-world datasets (including a large-scale industry dataset) show consistent gains without modifying model architectures or objectives. The result is a lightweight, broadly applicable data-augmentation approach for improving recommendation quality.

Abstract

Sequential recommendation leverages interaction sequences to predict forthcoming user behaviors, which is crucial for crafting personalized recommendations. However, the true preferences of a user are inherently complex and high-dimensional, while the observed data is merely a simplified, low-dimensional projection of those preferences, which often leads to data sparsity and inaccurate model training. To learn true preferences from sparse data, most existing works either introduce extra information or design more elaborate models. Although these approaches have been shown to be effective, extra information usually increases the cost of data collection, and complex models can be difficult to deploy. We instead avoid both extra information and changes to the model, and fill the transformation space between the observed data and the underlying preferences with randomness. Specifically, we propose a novel model-agnostic and highly generic framework for sequential recommendation called sample enrichment via temporary operations on subsequences (SETO), which temporarily and separately enriches the transformation space via sequence enhancement operations with rationality constraints during training. The transformation space exists not only in the process from input samples to preferences but also in the process from preferences to target samples. We demonstrate SETO's effectiveness and versatility on multiple representative and state-of-the-art sequential recommendation models (including six single-domain sequential models and two cross-domain sequential models) across multiple real-world datasets (including three single-domain datasets, three cross-domain datasets and a large-scale industry dataset).
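
To make the idea of temporary operations on subsequences concrete, here is a minimal Python sketch of a swap and a removal operation applied to a copy of a subsequence. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `max_distance` and `prob` parameters, and the simple "swap only nearby items" and "keep at least `min_len` items" rules are hypothetical stand-ins for the rationality constraints, whose exact form is not given in this summary.

```python
import random
from typing import List, Optional, Sequence


def swap(seq: Sequence[int], max_distance: int = 2,
         rng: Optional[random.Random] = None) -> List[int]:
    """Return a copy of seq with one pair of nearby items swapped.

    Assumption: limiting the swap to items at most max_distance apart
    stands in for a rationality constraint.
    """
    rng = rng if rng is not None else random
    out = list(seq)
    if len(out) < 2:
        return out
    i = rng.randrange(len(out) - 1)
    j = min(len(out) - 1, i + rng.randint(1, max_distance))
    out[i], out[j] = out[j], out[i]
    return out


def removal(seq: Sequence[int], min_len: int = 1,
            rng: Optional[random.Random] = None) -> List[int]:
    """Return a copy of seq with one random item removed.

    Assumption: never shrinking below min_len items stands in for a
    rationality constraint.
    """
    rng = rng if rng is not None else random
    out = list(seq)
    if len(out) <= min_len:
        return out
    del out[rng.randrange(len(out))]
    return out


def enrich(seq: Sequence[int], prob: float = 0.5,
           rng: Optional[random.Random] = None) -> List[int]:
    """With probability prob, apply one randomly chosen operation.

    The original sequence is never modified, so the change is temporary:
    a fresh variant can be drawn in every training iteration.
    """
    rng = rng if rng is not None else random
    if rng.random() >= prob:
        return list(seq)
    op = rng.choice((swap, removal))
    return op(seq, rng=rng)


if __name__ == "__main__":
    rng = random.Random(0)
    input_seq, target_seq = [10, 11, 12, 13], [11, 12, 13, 14]
    for step in range(3):
        # Input and target subsequences are enriched separately.
        print(step, enrich(input_seq, rng=rng), enrich(target_seq, rng=rng))
```

Because each call returns a new list and leaves the stored sequence untouched, the backbone model and its training objective need no changes; only the samples fed to it in a given iteration differ.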

Paper Structure

This paper contains 28 sections, 5 equations, 6 figures, 7 tables, 2 algorithms.

Figures (6)

  • Figure 1: (a) An example showing that there is no one-to-one correspondence between the observed data and the underlying preferences, but rather a transformation space between them. (b) Two-dimensional planar points represent the observed data, and three-dimensional spatial points represent the true user preferences.
  • Figure 2: An illustration of sample enhancement in training, where a dashed box indicates a sample sequence that was altered (i.e., altering the position of items in the sequence or the length of the sequence), and a solid box indicates an original sample sequence.
  • Figure 3: An overview of our SETO on the left, with the two operations illustrated on the right. $S'$ is the new sequence after the operations are applied to $S$. We apply the temporary operations with rationality constraints to the original input and target subsequences in each training iteration, thereby producing more diverse training samples.
  • Figure 4: Recommendation performance of SASRec with our SETO(S) using different values of $scope$ and $\rho$ on three datasets. Note that the y-axis does not start from 0.
  • Figure 5: Training loss curves of different methods on different datasets.
  • ...and 1 more figure