Session-based Recommender Systems: User Interest as a Stochastic Process in the Latent Space

Klaudia Balcer, Piotr Lipinski

TL;DR

The paper addresses data uncertainty, popularity bias, and exposure bias in session-based recommender systems by treating user interest as a stochastic process in latent space and implementing a model-agnostic component. It introduces three elements: debiasing item embeddings via embedding-uniformity on a sphere, modeling dense user interest with a von Mises–Fisher distribution, and extending exposure through fake-target sampling with a modified loss and an added regularizer. Experiments on Diginetica and YooChoose (including biased variants) show improved exposure and reduced embedding-based bias, with generalization benefits that depend on dataset bias. The approach provides a flexible framework to jointly mitigate key SBRS biases while remaining compatible with existing architectures like SR-GNN.

Abstract

This paper jointly addresses the problems of data uncertainty, popularity bias, and exposure bias in session-based recommender systems. We study the symptoms of these biases both in item embeddings and in recommendations. We propose treating user interest as a stochastic process in the latent space and provide a model-agnostic implementation of this mathematical concept. The proposed stochastic component consists of three elements: debiasing item embeddings with regularization for embedding uniformity, modeling dense user interest from session prefixes, and introducing fake targets in the data to simulate extended exposure. We conducted computational experiments on two popular benchmark datasets, Diginetica and YooChoose 1/64, as well as several modifications of the YooChoose dataset with different ratios of popular items. The results show that the proposed approach mitigates these challenges.
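
To make the idea of "regularization for embedding uniformity" more concrete, the following is a minimal sketch of one widely used uniformity measure on the unit sphere: the logarithm of the mean Gaussian potential over all pairs of normalized embeddings, as proposed by Wang and Isola. Whether the paper uses exactly this form is an assumption; the function names, the temperature `t`, and the toy vectors are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def normalize(v):
    """Project a vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def uniformity_loss(embeddings, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Assumed formulation (Wang and Isola), not confirmed as the paper's
    regularizer. Lower values mean the normalized embeddings are spread
    more evenly over the sphere.
    """
    unit = [normalize(v) for v in embeddings]
    potentials = []
    for u, v in combinations(unit, 2):
        sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
        potentials.append(math.exp(-t * sq_dist))
    return math.log(sum(potentials) / len(potentials))


# Hypothetical toy embeddings: one tight cluster, one evenly spread set.
clustered = [[1.0, 0.0], [1.0, 0.1], [1.0, -0.1], [0.9, 0.05]]
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

loss_clustered = uniformity_loss(clustered)
loss_spread = uniformity_loss(spread)
```

In this sketch the evenly spread set yields a lower loss than the clustered set, so adding such a term to a training objective would push item embeddings apart on the sphere.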

Paper Structure

This paper contains 18 sections, 5 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Points represent random 2D embeddings. Points marked with 'x' (one per plot) represent the predicted session embeddings. Colour encodes the dot product between the session embedding and each item embedding; items with products closest to red would be recommended.
  • Figure 2: Similarity to the closest item: a histogram over all items and a violin plot (KDE on the Y axis) for items grouped by popularity.
  • Figure 3: The circle is the dense spherical space of possible user interest. The latent user interest is marked with 'x'. The colours of the circle correspond to the alignment with user interest. The dots and stars are the representations of all considered items. The user has been exposed to items shifted outside the circle; those shifted inside remain unseen. The green star is the target present in the data. The magenta stars are possible, unseen targets, which we would like to discover by sampling fake targets.
  • Figure 4: Histogram of similarity with the closest item (calculated with cosine similarity between item embeddings). Each column corresponds to a model (noisy SR-GNN, NISER, SR-GNN, TAGNN) and each row to a dataset.
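
The "similarity with the closest item" statistic plotted in Figures 2 and 4 can be sketched as follows: for each item, take the maximum cosine similarity to any other item embedding. This is a minimal pure-Python illustration under assumed names (`cosine_similarity`, `closest_item_similarity`) and hypothetical toy vectors; it is not the authors' evaluation code.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def closest_item_similarity(embeddings):
    """For each item, return its highest cosine similarity to any other item."""
    result = []
    for i, u in enumerate(embeddings):
        best = max(
            cosine_similarity(u, v)
            for j, v in enumerate(embeddings)
            if j != i  # skip the item itself
        )
        result.append(best)
    return result


# Hypothetical 2D item embeddings, not taken from the paper.
toy_embeddings = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [-1.0, 0.0],
]
similarities = closest_item_similarity(toy_embeddings)
```

A histogram of `similarities` would correspond to one panel of the figures; values piled up near 1 indicate that many items have a near-duplicate neighbour in the embedding space. The pairwise loop is quadratic in the number of items, so a real catalogue would call for a vectorized or approximate nearest-neighbour computation.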