Barlow Twins for Sequential Recommendation

Ivan Razvorotnev, Marina Munkhoeva, Evgeny Frolov

TL;DR

BT-SR presents a non-contrastive, Barlow Twins-based regularization for Transformer-based sequential recommendation, addressing sparsity and popularity bias while enabling controllable accuracy-diversity tradeoffs via a single hyperparameter $\alpha$. By pairing supervised augmentations that align sequences converging to the same intent with a redundancy-reduction objective, BT-SR yields sharper, more diverse user embeddings without negative sampling. Empirical results across five public datasets show consistent gains in next-item accuracy, coverage of long-tail items, and calibration over strong baselines, including contrastive-learning methods. The approach is end-to-end and scalable, offering a practical pathway to fairer, higher-performing sequential recommender systems.

Abstract

Sequential recommendation models must navigate sparse interaction data, popularity bias, and conflicting objectives like accuracy versus diversity. While recent contrastive self-supervised learning (SSL) methods offer improved accuracy, they come with tradeoffs: large batch requirements, reliance on handcrafted augmentations, and negative sampling that can reinforce popularity bias. In this paper, we introduce BT-SR, a novel non-contrastive SSL framework that integrates the Barlow Twins redundancy-reduction principle into a Transformer-based next-item recommender. BT-SR learns embeddings that align users with similar short-term behaviors while preserving long-term distinctions, without requiring negative sampling or artificial perturbations. This structure-sensitive alignment allows BT-SR to more effectively recognize emerging user intent and mitigate the influence of noisy historical context. Our experiments on five public benchmarks demonstrate that BT-SR consistently improves next-item prediction accuracy and significantly enhances long-tail item coverage and recommendation calibration. Crucially, we show that a single hyperparameter can control the accuracy-diversity tradeoff, enabling practitioners to adapt recommendations to specific application needs.
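
For readers unfamiliar with the redundancy-reduction principle mentioned above, the following is a minimal NumPy sketch of the standard Barlow Twins objective as it is usually formulated for two batches of paired embeddings. It is not the authors' implementation. The function names, the `lambda_offdiag` default, and in particular the `combined_loss` helper (which assumes the regularizer is simply added to the next-item loss with a scalar weight `alpha`) are illustrative assumptions; the text here does not state how BT-SR's hyperparameter enters the loss.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-8):
    """Standard Barlow Twins loss for two (batch, dim) embedding matrices.

    Row i of z_a and row i of z_b are assumed to be embeddings of a paired
    example (e.g. two sequences treated as views of the same intent).
    """
    n = z_a.shape[0]
    # Standardize every embedding dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Invariance term: pull the diagonal toward 1 (paired views agree).
    on_diag = np.sum((diag - 1.0) ** 2)
    # Redundancy-reduction term: push off-diagonal entries toward 0.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return float(on_diag + lambda_offdiag * off_diag)


def combined_loss(next_item_loss, bt_loss, alpha):
    """HYPOTHETICAL weighting: assumes alpha scales the regularizer.

    This is an assumption for illustration only, not confirmed by the text.
    """
    return next_item_loss + alpha * bt_loss


rng = np.random.default_rng(0)
z = rng.normal(size=(256, 8))
loss_same = barlow_twins_loss(z, z)                           # identical views
loss_diff = barlow_twins_loss(z, rng.normal(size=(256, 8)))   # unrelated views
print(loss_same < loss_diff)
```

Note that the objective uses only paired views and a batch-level correlation matrix, so no negative samples are drawn, which is the property the abstract emphasizes.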

Paper Structure

This paper contains 18 sections, 12 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: HR@1 (left) and HR@10 (right) metrics for the three item-popularity buckets across three datasets.
  • Figure 2: (left) Comparison of score density distributions for positive and negative candidate pairs across three datasets. We quantify the model’s confidence in distinguishing relevant candidates by the histogram overlap factor (Overlap); lower values indicate better separation. (right) Singular value spectra of the sequence embeddings for each dataset, annotated with their computed effective ranks (in legend), illustrate the effective dimensionality of the learned representations.
  • Figure 3: Parameter sensitivity with respect to $\alpha$ for the Yelp dataset. Left: common metrics. Right: item-popularity bucket metrics. The figure demonstrates the controllability of recommendations via the hyperparameter.
  • Figure 4: Parameter sensitivity with respect to $\alpha$ for the MovieLens dataset. Left: common metrics. Right: item-popularity bucket metrics. The figure demonstrates the controllability of recommendations via the hyperparameter.
  • Figure 5: Performance comparison (HR@1 and HR@10) for the BT-SR method on the Yelp dataset under two regimes: $\alpha=0.1$ favors popular items (high HR@1), while $\alpha=0.4$ promotes diverse, less popular items early in the list, boosting personalization and maintaining high overall accuracy (HR@10).
  • ...and 4 more figures
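
The Figure 2 caption refers to two diagnostics, a histogram overlap factor and an effective rank, without defining them in this excerpt. The sketch below shows one common way each could be computed, under stated assumptions: overlap as the shared mass of two normalized histograms on common bins, and effective rank as the exponential of the entropy of the normalized singular values. The function names, the bin count, and both definitions are assumptions for illustration and may differ from what the paper uses.

```python
import numpy as np


def histogram_overlap(pos_scores, neg_scores, bins=50):
    """Shared mass of two normalized histograms (assumed definition).

    Returns a value in [0, 1]; 0 means the score distributions are disjoint.
    """
    lo = min(pos_scores.min(), neg_scores.min())
    hi = max(pos_scores.max(), neg_scores.max())
    edges = np.linspace(lo, hi, bins + 1)  # common bins for both sets
    p, _ = np.histogram(pos_scores, bins=edges)
    q, _ = np.histogram(neg_scores, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.minimum(p, q).sum())


def effective_rank(embeddings, eps=1e-12):
    """exp(entropy) of the normalized singular values (assumed definition)."""
    s = np.linalg.svd(embeddings, compute_uv=False)
    p = s / (s.sum() + eps)
    entropy = -np.sum(p * np.log(p + eps))
    return float(np.exp(entropy))


# Toy inputs: well-separated scores, and embeddings of known rank.
pos = np.array([5.0, 5.1, 5.2])
neg = np.array([0.0, 0.1, 0.2])
overlap_disjoint = histogram_overlap(pos, neg)
overlap_same = histogram_overlap(pos, pos)
rank_full = effective_rank(np.eye(4))                        # four equal directions
rank_one = effective_rank(np.outer(np.ones(4), np.ones(3)))  # one direction
```

Under these definitions, a lower overlap means positive and negative candidates are scored more distinctly, and a higher effective rank means the embedding variance is spread over more directions.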