Scaling Session-Based Transformer Recommendations using Optimized Negative Sampling and Loss Functions
Timo Wilm, Philipp Normann, Sophie Baumeister, Paul-Vincent Kobow
TL;DR
TRON addresses scalability and accuracy gaps in session-based recommender systems by extending SASRec with optimized negative sampling and a listwise loss. Its hybrid negative sampling scheme draws $k$ negatives from the uniform distribution $\mathcal{U}_I$ and $m$ negatives from the empirical item-frequency distribution $\mathcal{F}_I$, then keeps only the highest-scoring (top-k) negatives for each session $s$ and time step $t$, $\mathcal{KN}_{s}^{t} = \operatorname{topk}(\cdot)$, re-selecting them at every training step. These negatives feed the listwise sampled softmax (SSM) loss to improve ranking. Across Diginetica, Yoochoose, and OTTO, TRON achieves higher Recall@20 and MRR@20 than current methods such as SASRec and GRU4Rec+, while keeping training speed similar to SASRec. In a live A/B test at OTTO, it increased click-through rate (CTR) by 18.14% over SASRec. The source code and an anonymized dataset are publicly available, underlining TRON’s practicality for large-scale e-commerce recommendations.
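The sampling and loss steps summarized above can be illustrated with a minimal pure-Python sketch for a single session and time step. It is not taken from the TRON repository, which operates on batched tensors. The function names, the `item_counts` and `scores` dictionaries, and the toy values are all hypothetical, and the sketch assumes that $\mathcal{U}_I$ is uniform over items and that $\mathcal{F}_I$ is proportional to item interaction counts.

```python
import math
import random


def sample_negatives(item_counts, k, m, rng):
    """Draw k uniform negatives and m frequency-weighted negatives (with replacement)."""
    items = list(item_counts)
    uniform = rng.choices(items, k=k)
    weights = [item_counts[item] for item in items]
    by_frequency = rng.choices(items, weights=weights, k=m)
    return uniform + by_frequency


def top_k_negatives(negatives, scores, top_k):
    """Keep the top_k distinct negatives that the model currently scores highest."""
    ranked = sorted(set(negatives), key=lambda item: scores[item], reverse=True)
    return ranked[:top_k]


def sampled_softmax_loss(pos_score, neg_scores):
    """Negative log-softmax of the positive among the positive and the negatives."""
    logits = [pos_score] + list(neg_scores)
    peak = max(logits)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - pos_score


# Toy example (all values are made up for illustration).
rng = random.Random(0)
item_counts = {0: 50, 1: 30, 2: 10, 3: 5, 4: 5}       # interaction counts per item
scores = {0: 2.0, 1: 0.5, 2: 1.5, 3: -1.0, 4: 0.0}    # current model scores per item
positive = 0

candidates = sample_negatives(item_counts, k=4, m=4, rng=rng)
candidates = [item for item in candidates if item != positive]
hard_negatives = top_k_negatives(candidates, scores, top_k=2)
loss = sampled_softmax_loss(scores[positive], [scores[n] for n in hard_negatives])
```

Because the top-k selection depends on the current scores, the retained negatives change as the model trains. This is the sense in which they are re-selected at every step.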
Abstract
This work introduces TRON, a scalable session-based Transformer Recommender using Optimized Negative-sampling. Motivated by the scalability and performance limitations of prevailing models such as SASRec and GRU4Rec+, TRON integrates top-k negative sampling and listwise loss functions to enhance its recommendation accuracy. Evaluations on relevant large-scale e-commerce datasets show that TRON improves upon the recommendation quality of current methods while maintaining training speeds similar to SASRec. A live A/B test yielded an 18.14% increase in click-through rate over SASRec, highlighting the potential of TRON in practical settings. For further research, we provide access to our source code at https://github.com/otto-de/TRON and an anonymized dataset at https://github.com/otto-de/recsys-dataset.
