Towards Principled Learning for Re-ranking in Recommender Systems
Qunwei Li, Linghui Li, Jianbin Lin, Wenliang Zhong
TL;DR
The paper tackles the absence of principled training signals for listwise re-ranking in recommender systems and introduces two principles, Convergence Consistency and Adversarial Consistency, to regularize learning. An algorithm integrates these principles into the re-ranker loss through a Contrastive Similarity term that applies to generic architectures, $\mathcal{L}_{CS}(\bm{P}_A,\bm{P}_B) = |\bm{P}_A-\bm{P}_B|^T (\mathcal{R}(\bm{X},\bm{P}_A) - \mathcal{R}(\bm{X},\bm{P}_B))^2$, together with a combined loss that enforces convergence and robustness. Experiments on the Ad and PRM Public datasets show consistent improvements across baselines in $\text{AUC}$, $\text{NDCG}$, and $\text{MAP}@k$, indicating robustness and broad applicability. The results suggest that principled, loss-level regularization can improve the quality of final ranked lists in real-world recommender systems.
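As a rough illustration of the Contrastive Similarity term, the NumPy sketch below evaluates it for a single pair of permutations. It rests on an interpretation that this summary does not spell out: $\bm{P}_A$ and $\bm{P}_B$ are taken to be vectors holding each item's position, and $\mathcal{R}(\bm{X},\bm{P})$ is taken to return one score per item in a fixed item order. The function name `contrastive_similarity` and its argument names are hypothetical and do not come from the paper.

```python
import numpy as np

def contrastive_similarity(pos_a, pos_b, scores_a, scores_b):
    """Sketch of L_CS = |P_A - P_B|^T (R(X, P_A) - R(X, P_B))^2.

    Assumption: pos_* hold the position of each item under a permutation,
    and scores_* hold the re-ranker score of each item under that same
    permutation, all indexed by item.
    """
    pos_a = np.asarray(pos_a, dtype=float)
    pos_b = np.asarray(pos_b, dtype=float)
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)

    # Per-item weight: how far the item moved between the two permutations.
    weight = np.abs(pos_a - pos_b)
    # Per-item squared change in the re-ranker score.
    squared_diff = (scores_a - scores_b) ** 2
    # Inner product gives a scalar loss.
    return float(weight @ squared_diff)

# Items 1 and 2 swap positions; item 0 stays in place.
loss = contrastive_similarity(
    pos_a=[0, 1, 2],
    pos_b=[0, 2, 1],
    scores_a=[0.9, 0.5, 0.1],
    scores_b=[0.9, 0.4, 0.3],
)
```

Under this reading, an item that keeps its position contributes nothing to the term, while an item that moves far is penalized in proportion to how much its score changes between the two permutations.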
Abstract
As the final stage of a recommender system, re-ranking presents users with ordered item lists that best match their interests. Because of this critical role, it has become an active research topic that attracts attention from both academia and industry. Recent advances in re-ranking focus on attentive listwise modeling of the interactions and mutual influences among the items to be re-ranked. However, principles to guide the learning process of a re-ranker, and to measure the quality of its output, have always been missing. In this paper, we study such principles for learning a good re-ranker. We propose two principles: convergence consistency and adversarial consistency. Both can be applied when training a generic re-ranker and improve its performance. We validate this finding with various baseline methods on different datasets.
