FINEST: Stabilizing Recommendations by Rank-Preserving Fine-Tuning

Sejoon Oh, Berk Ustun, Julian McAuley, Srijan Kumar

TL;DR

This work tackles instability in sequential recommender outputs caused by perturbations to the training data. It introduces FINEST, a model-agnostic fine-tuning approach that first derives reference rank lists from a base model, then simulates perturbations during fine-tuning and enforces rank preservation on the top-$K$ items, stabilizing rankings without sacrificing next-item accuracy. The key contributions are a perturbation-driven fine-tuning framework, a rank-preserving regularization designed for scalable top-$K$ preservation, and experiments showing improved rank-list stability (measured with $RLS$ metrics such as $RBO$ and Top-$K$ Jaccard) while maintaining strong predictive performance across three real-world datasets and multiple base models. The results matter for deploying recommender systems in settings where small data perturbations could otherwise cause large ranking shifts. The work also discusses limitations and future directions, including extensions to other perturbation types and to non-sequential or multimodal settings.
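
The two similarity measures named above can be illustrated with a short sketch. This is not the paper's evaluation code: the function names `topk_jaccard` and `rbo_truncated`, the persistence parameter `p=0.9`, and the choice of the truncated (non-extrapolated) form of rank-biased overlap are assumptions made for illustration. Both functions assume each rank list contains no repeated items.

```python
def topk_jaccard(list_a, list_b, k=10):
    """Jaccard similarity between the top-k items of two rank lists."""
    top_a, top_b = set(list_a[:k]), set(list_b[:k])
    union = top_a | top_b
    # Two empty lists are treated as identical (an assumption of this sketch).
    return len(top_a & top_b) / len(union) if union else 1.0


def rbo_truncated(list_a, list_b, p=0.9, depth=None):
    """Truncated rank-biased overlap: (1 - p) * sum_d p^(d-1) * overlap_d / d.

    Disagreements near the top of the lists are weighted more heavily
    than disagreements further down.
    """
    if depth is None:
        depth = min(len(list_a), len(list_b))
    seen_a, seen_b = set(), set()
    overlap = 0      # size of the intersection of the two depth-d prefixes
    score = 0.0
    for d in range(1, depth + 1):
        x, y = list_a[d - 1], list_b[d - 1]
        if x == y:
            overlap += 1
        else:
            overlap += (x in seen_b) + (y in seen_a)
        seen_a.add(x)
        seen_b.add(y)
        score += p ** (d - 1) * overlap / d
    return (1 - p) * score


original = [1, 2, 3, 4, 5]
perturbed = [2, 1, 3, 5, 6]
print(topk_jaccard(original, perturbed, k=5))
print(rbo_truncated(original, perturbed))
```

A per-user score like these would typically be averaged over users to summarize how much a model's rank lists move between the original and the perturbed training data; higher values mean more stable lists.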

Abstract

Modern recommender systems may output considerably different recommendations due to small perturbations in the training data. Changes in the data from a single user will alter that user's recommendations as well as the recommendations of other users. In applications like healthcare, housing, and finance, this sensitivity can have adverse effects on user experience. We propose a method to stabilize a given recommender system against such perturbations. This is a challenging task due to (1) the lack of a "reference" rank list that can be used to anchor the outputs; and (2) the computational challenges in ensuring the stability of rank lists with respect to all possible perturbations of training data. Our method, FINEST, overcomes these challenges by obtaining reference rank lists from a given recommendation model and then fine-tuning the model under simulated perturbation scenarios with rank-preserving regularization on sampled items. Our experiments on real-world datasets demonstrate that FINEST can ensure that recommender models output stable recommendations under a wide range of different perturbations without compromising next-item prediction accuracy.
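
To make "simulated perturbation scenarios" concrete, the following is a minimal sketch of one possible perturbation: deleting a small random fraction of interactions from the training sequences. The function name `random_deletion`, the `ratio` parameter, and the dictionary-of-lists data layout are hypothetical, and random deletion is only one perturbation type; the paper's own sampling procedure may differ.

```python
import random


def random_deletion(sequences, ratio=0.01, rng=None):
    """Return a copy of `sequences` with a random fraction of interactions removed.

    sequences: dict mapping a user id to that user's ordered list of item ids.
    ratio:     fraction of all interactions to delete (hypothetical parameter).
    rng:       optional random.Random instance, for reproducibility.
    """
    rng = rng or random.Random()
    # Enumerate every (user, position) pair so deletions are uniform over interactions.
    positions = [(user, i) for user, seq in sequences.items() for i in range(len(seq))]
    n_drop = int(len(positions) * ratio)
    dropped = set(rng.sample(positions, n_drop))
    # Rebuild each sequence without the dropped positions, keeping the original order.
    return {
        user: [item for i, item in enumerate(seq) if (user, i) not in dropped]
        for user, seq in sequences.items()
    }


train = {"u1": [1, 2, 3, 4, 5], "u2": [6, 7, 8, 9, 10]}
for epoch in range(3):
    # A fresh perturbation is drawn each epoch, so the model never sees one fixed variant.
    perturbed = random_deletion(train, ratio=0.2, rng=random.Random(epoch))
    print(epoch, perturbed)
```

Drawing a new perturbation every epoch is meant to approximate robustness to many possible perturbations without enumerating them all, which the abstract identifies as computationally infeasible.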

Paper Structure

This paper contains 27 sections, 6 equations, 7 figures, 5 tables, and 1 algorithm.

Figures (7)

  • Figure 1: Sequential recommendation models can output drastically different rank lists due to small perturbations in user interaction in the training data. Here, we show a training dataset of user interactions ("original") and a copy that contains minor perturbations ("perturbed", with perturbations highlighted in red). Recommendation systems trained using each dataset will output different rank lists for end-users (right-top and right-middle). Our proposed approach FINEST (right-bottom) will stabilize outputs to ensure that training with a "perturbed dataset" will return rank lists that are as close as possible to the rank list from the "original dataset."
  • Figure 2: Overview of stabilizing a recommender model via fine-tuning with FINEST. First, we obtain reference recommendations for all training instances from a given recommendation model. Next, randomly sampled and perturbed data (changing every epoch) is used to fine-tune the model. FINEST adds rank-preserving regularization to minimize differences between the reference and fine-tuned rank lists (generated under pseudo-perturbations). By simulating perturbations, FINEST can generate stable rank lists even in the presence of actual input perturbations.
  • Figure 3: FINEST uses a rank-preserving regularization term to penalize differences in ordering and prediction scores of the top-$K$ items with respect to a reference rank list. With the regularizer, the recommender can generate a top-$K$ recommendation similar to the reference one under perturbations. $\Theta_{B}$ and $\Theta_{F}$ indicate $\Theta_{\mathit{Base}}$ and $\Theta_{\mathit{FINEST}}$, respectively.
  • Figure 4: Stability of the BERT4Rec model fine-tuned with diverse methods against random and CASPER [oh2022robustness] deletion perturbations across different datasets. FINEST generates the most stable model against both perturbations as measured by RBO and Top-10 Jaccard Similarity.
  • Figure 5: Stability of BERT4Rec fine-tuned with and without FINEST as per the number of input perturbations on the LastFM dataset.
  • ...and 2 more figures
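
The captions for Figures 2 and 3 describe a regularizer that penalizes deviations from a reference top-$K$ ordering. The sketch below shows one way such a penalty could look, as a pairwise hinge loss. It is an illustrative stand-in and not the paper's equation: the function name `rank_preserving_penalty`, the `margin` value, and the specific hinge form are all assumptions.

```python
def rank_preserving_penalty(scores, ref_topk, sampled_items, margin=0.1):
    """Illustrative pairwise hinge penalty (assumed form, not the paper's equation).

    scores:        current (fine-tuned) model scores, indexed by item id.
    ref_topk:      item ids of the reference top-K list, best first.
    sampled_items: item ids sampled from outside the reference top-K.
    """
    penalty = 0.0
    # (1) Keep the reference order inside the top-K: each item should
    #     outscore the next reference item by at least `margin`.
    for higher, lower in zip(ref_topk, ref_topk[1:]):
        penalty += max(0.0, scores[lower] - scores[higher] + margin)
    # (2) Keep sampled outside items below the last reference top-K item.
    last = ref_topk[-1]
    for item in sampled_items:
        penalty += max(0.0, scores[item] - scores[last] + margin)
    return penalty


reference_topk = [0, 1, 2]          # from the base model
sampled = [3, 4]                    # sampled non-top-K items
stable_scores = [0.9, 0.7, 0.5, 0.1, 0.0]
shifted_scores = [0.5, 0.9, 0.7, 0.1, 0.0]   # item 0 dropped below items 1 and 2
print(rank_preserving_penalty(stable_scores, reference_topk, sampled))
print(rank_preserving_penalty(shifted_scores, reference_topk, sampled))
```

In a fine-tuning loop, a term like this would be added to the usual next-item prediction loss, with scores computed on perturbed data and `ref_topk` fixed from the base model. Restricting the second term to sampled items keeps the cost independent of the catalog size, in line with the abstract's "rank-preserving regularization on sampled items".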