FINEST: Stabilizing Recommendations by Rank-Preserving Fine-Tuning
Sejoon Oh, Berk Ustun, Julian McAuley, Srijan Kumar
TL;DR
This work tackles instability in the outputs of sequential recommenders caused by perturbations to the training data. It introduces FINEST, a model-agnostic fine-tuning approach that first derives reference rank lists from a base model, then simulates perturbations during fine-tuning and enforces rank preservation on the top-$K$ items, stabilizing rankings without sacrificing next-item accuracy. The key contributions are a perturbation-driven fine-tuning framework, a rank-preserving regularization designed for scalable top-$K$ preservation, and extensive experiments showing improved rank-list stability, measured with rank list sensitivity (RLS) metrics such as rank-biased overlap (RBO) and top-$K$ Jaccard similarity, while maintaining strong predictive performance across three real-world datasets and multiple base models. The results indicate that the approach is practical for deploying stable recommender systems in settings where small data perturbations could otherwise cause large ranking shifts. The work also discusses limitations and future directions, including extensions to other perturbation types and to non-sequential or multimodal settings.
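To make the stability metrics named above concrete, the following is a minimal sketch of how top-$K$ Jaccard similarity and a truncated form of rank-biased overlap could be computed for two rank lists, for example the lists a model produces before and after a training-data perturbation. The function names, the persistence parameter `p`, and the truncation depth are illustrative assumptions, not details taken from the paper.

```python
def topk_jaccard(list_a, list_b, k):
    """Jaccard similarity between the top-k item sets of two rank lists."""
    top_a, top_b = set(list_a[:k]), set(list_b[:k])
    union = top_a | top_b
    # Two empty prefixes are treated as identical.
    return len(top_a & top_b) / len(union) if union else 1.0


def rbo_truncated(list_a, list_b, p=0.9, depth=None):
    """Truncated rank-biased overlap.

    Computes (1 - p) * sum_{d=1..depth} p^(d-1) * A_d, where A_d is the
    overlap between the two depth-d prefixes divided by d. `p` and `depth`
    are assumed defaults chosen for illustration.
    """
    if depth is None:
        depth = min(len(list_a), len(list_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        seen_a.add(list_a[d - 1])
        seen_b.add(list_b[d - 1])
        agreement = len(seen_a & seen_b) / d
        score += (p ** (d - 1)) * agreement
    return (1 - p) * score


if __name__ == "__main__":
    before = ["a", "b", "c", "d"]
    after = ["a", "c", "b", "e"]
    print(topk_jaccard(before, after, k=3))
    print(rbo_truncated(before, after, p=0.9))
```

Because the sum is truncated, two identical lists score $1 - p^{\text{depth}}$ rather than exactly 1; an extrapolated variant of RBO would be needed to remove that gap. Higher values of either metric indicate more similar, and therefore more stable, rank lists.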
Abstract
Modern recommender systems may output considerably different recommendations due to small perturbations in the training data. Changes in the data from a single user will alter that user's recommendations as well as the recommendations of other users. In applications like healthcare, housing, and finance, this sensitivity can have adverse effects on user experience. We propose a method to stabilize a given recommender system against such perturbations. This is a challenging task due to (1) the lack of a "reference" rank list that can be used to anchor the outputs; and (2) the computational challenges in ensuring the stability of rank lists with respect to all possible perturbations of training data. Our method, FINEST, overcomes these challenges by obtaining reference rank lists from a given recommendation model and then fine-tuning the model under simulated perturbation scenarios with rank-preserving regularization on sampled items. Our experiments on real-world datasets demonstrate that FINEST can ensure that recommender models output stable recommendations under a wide range of perturbations without compromising next-item prediction accuracy.
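The idea of rank-preserving regularization on sampled items can be illustrated with a small sketch. This is not the authors' exact loss: it is one plausible hinge-style penalty, written under the assumption that the regularizer should (a) keep the reference top-$K$ items in their original order and (b) keep a handful of sampled items outside the top-$K$ scored below the $K$-th reference item. The function name, the `margin` value, and the argument layout are all hypothetical.

```python
def rank_preserving_penalty(scores, reference_topk, sampled_items, margin=0.1):
    """Hypothetical hinge penalty for preserving a reference top-K ranking.

    scores:         item scores from the model being fine-tuned, indexed by item id
    reference_topk: item ids in reference order (best first), from the base model
    sampled_items:  item ids sampled from outside the reference top-K
    margin:         assumed minimum score gap between ordered items

    Returns 0.0 when the reference order is kept with at least `margin`
    between consecutive items and every sampled item scores at least
    `margin` below the last reference item.
    """
    if not reference_topk:
        return 0.0
    penalty = 0.0
    # (a) Each reference item should outscore the next one in the list.
    for higher, lower in zip(reference_topk, reference_topk[1:]):
        penalty += max(0.0, margin - (scores[higher] - scores[lower]))
    # (b) Sampled non-top-K items should stay below the K-th reference item.
    last = reference_topk[-1]
    for item in sampled_items:
        penalty += max(0.0, margin - (scores[last] - scores[item]))
    return penalty


if __name__ == "__main__":
    reference = [0, 1, 2]          # reference top-3 from the base model
    sampled = [3]                  # one sampled item outside the top-3
    stable = [3.0, 2.0, 1.0, 0.0]  # scores that keep the reference order
    swapped = [2.0, 3.0, 1.0, 0.0] # items 0 and 1 have swapped places
    print(rank_preserving_penalty(stable, reference, sampled, margin=0.5))
    print(rank_preserving_penalty(swapped, reference, sampled, margin=0.5))
```

In a fine-tuning loop, a term of this kind would be added to the usual next-item prediction loss while the model is trained on simulated perturbed data. Sampling only a few items outside the top-$K$ keeps the cost per user proportional to $K$ plus the sample size rather than to the full item catalog.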
