Scaling Session-Based Transformer Recommendations using Optimized Negative Sampling and Loss Functions

Timo Wilm; Philipp Normann; Sophie Baumeister; Paul-Vincent Kobow

TL;DR

TRON addresses scalability and accuracy gaps in session-based recommender systems by extending SASRec with optimized negative sampling and a listwise loss. It uses a hybrid negative sampling scheme that combines $k$ negatives drawn from $\mathcal{U}_I$ with $m$ negatives drawn from $\mathcal{F}_I$, re-selects the top-$k$ negatives at every step via $\mathcal{KN}_{s}^{t} = \operatorname{topk}(\cdot)$, and adopts the listwise sampled softmax ($SSM$) loss to improve ranking. Across Diginetica, Yoochoose, and OTTO, TRON achieves higher $Recall@20$ and $MRR@20$ than prior methods while keeping training speed competitive with SASRec. In a live OTTO A/B test, it delivers an $18.14\%$ CTR uplift over SASRec. The work provides publicly accessible code and an anonymized dataset, highlighting TRON’s practicality for large-scale e-commerce recommendations.

Abstract

This work introduces TRON, a scalable session-based Transformer Recommender using Optimized Negative-sampling. Motivated by the scalability and performance limitations of prevailing models such as SASRec and GRU4Rec+, TRON integrates top-k negative sampling and listwise loss functions to enhance its recommendation accuracy. Evaluations on relevant large-scale e-commerce datasets show that TRON improves upon the recommendation quality of current methods while maintaining training speeds similar to SASRec. A live A/B test yielded an 18.14% increase in click-through rate over SASRec, highlighting the potential of TRON in practical settings. For further research, we provide access to our source code at https://github.com/otto-de/TRON and an anonymized dataset at https://github.com/otto-de/recsys-dataset.
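The interplay of negative sampling, top-k selection, and a listwise loss can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it is a minimal pure-Python illustration under stated assumptions: $\mathcal{U}_I$ is treated as a uniform distribution over items, $\mathcal{F}_I$ as a distribution proportional to item frequency, negatives are scored by a dot product with the session representation, and the loss is a sampled softmax over the positive plus the retained negatives. All function names, sizes, and the random toy data are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sample_negatives(num_items, item_counts, k_uniform, m_freq, positive, rng):
    """Draw k_uniform ids uniformly and m_freq ids proportional to item counts.

    Assumption: the positive item is rejected; duplicates are allowed.
    """
    items = range(num_items)
    negatives = []
    while len(negatives) < k_uniform:
        i = rng.randrange(num_items)
        if i != positive:
            negatives.append(i)
    while len(negatives) < k_uniform + m_freq:
        i = rng.choices(items, weights=item_counts, k=1)[0]
        if i != positive:
            negatives.append(i)
    return negatives


def top_k_negatives(session_vec, item_emb, negatives, top_k):
    """Keep the top_k negatives with the highest score for this session step."""
    ranked = sorted(negatives, key=lambda i: dot(session_vec, item_emb[i]), reverse=True)
    return ranked[:top_k]


def sampled_softmax_loss(session_vec, item_emb, positive, negatives):
    """Negative log-probability of the positive among positive + negatives."""
    logits = [dot(session_vec, item_emb[positive])]
    logits += [dot(session_vec, item_emb[i]) for i in negatives]
    shift = max(logits)  # subtract the max for numerical stability
    log_z = shift + math.log(sum(math.exp(l - shift) for l in logits))
    return log_z - logits[0]


# Toy data (hypothetical): 50 items with 8-dimensional embeddings.
rng = random.Random(0)
num_items, dim = 50, 8
item_emb = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_items)]
item_counts = [rng.randint(1, 100) for _ in range(num_items)]
session_vec = [rng.gauss(0, 1) for _ in range(dim)]
positive = 3

candidates = sample_negatives(num_items, item_counts, 16, 16, positive, rng)
hard = top_k_negatives(session_vec, item_emb, candidates, top_k=5)
loss_hard = sampled_softmax_loss(session_vec, item_emb, positive, hard)
```

In this sketch the loss is computed only over the highest-scoring negatives, so the sum in the softmax denominator shrinks from 32 sampled candidates to 5 retained ones. Whether that matches the exact batching and scoring used in TRON is not stated in this summary.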

Paper Structure

This paper contains 8 sections, 2 figures, and 2 tables.

Introduction
Methods
Negative Sampling
Loss Functions
Experimental Setup
Results
Conclusion
Speaker Bio

Figures (2)

Figure 1: Offline evaluation results on our private OTTO dataset used for the online A/B test of our three groups.
Figure 2: Online results of our A/B test relative to the SASRec baseline. The error bars indicate the 95% confidence interval.
