From Pairwise to Ranking: Climbing the Ladder to Ideal Collaborative Filtering with Pseudo-Ranking
Yuhan Zhao, Rui Chen, Li Chen, Shuang Zhang, Qilong Han, Hongtao Song
TL;DR
The paper addresses the problem that ideal collaborative filtering, which optimizes over full user rankings, is unattainable in practice due to missing full rankings and lack of losses that exploit ranking information. It introduces a pseudo-ranking paradigm (PRP) comprising a ranker that generates pseudo-rankings via a noise-injected signal and a novel ranking loss with a gradient-based confidence mechanism, trained alongside a secondary loss $\mathcal{L}_p$ to supervise ranking learning. The approach reframes ranking as a multiple ordinal classification problem and establishes a differentiable loss that enforces ordinal ordering, with a gradient-density-based weighting to downweight unreliable pseudo-rankings. Empirical results on four real-world datasets show that PRP consistently outperforms state-of-the-art methods and can substantially boost existing CF models, demonstrating a practical path toward closer-to-ideal CF performance.
Abstract
Intuitively, an ideal collaborative filtering (CF) model should learn from users' full rankings over all items to make optimal top-K recommendations. Due to the absence of such full rankings in practice, most CF models rely on pairwise loss functions to approximate full rankings, resulting in an immense performance gap. In this paper, we provide a novel analysis using the multiple ordinal classification concept to reveal the inevitable gap between a pairwise approximation and the ideal case. However, bridging the gap in practice encounters two formidable challenges: (1) none of the real-world datasets contains full ranking information; (2) there does not exist a loss function that is capable of consuming ranking information. To overcome these challenges, we propose a pseudo-ranking paradigm (PRP) that addresses the lack of ranking information by introducing pseudo-rankings supervised by an original noise injection mechanism. Additionally, we put forward a new ranking loss function designed to handle ranking information effectively. To ensure our method's robustness against potential inaccuracies in pseudo-rankings, we equip the ranking loss function with a gradient-based confidence mechanism to detect and mitigate abnormal gradients. Extensive experiments on four real-world datasets demonstrate that PRP significantly outperforms state-of-the-art methods.
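To make the limitation of pairwise losses concrete, here is a minimal sketch assuming a BPR-style loss, $-\log \sigma(s_{pos} - s_{neg})$, as a representative pairwise objective; the abstract does not name a specific one. The function names, item labels, and scores are hypothetical and are not taken from the paper. The sketch only illustrates that such a loss constrains positive-versus-negative pairs and carries no signal about the order among the positives themselves.

```python
import math


def bpr_pairwise_loss(pos_score, neg_score):
    """BPR-style pairwise loss: -log(sigmoid(pos_score - neg_score)).

    Assumed here as a representative pairwise loss; computed in a
    numerically stable way as softplus(-(pos_score - neg_score)).
    """
    diff = pos_score - neg_score
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def mean_pairwise_loss(scores, positives, negatives):
    """Average the pairwise loss over every (positive, negative) item pair."""
    pairs = [(p, n) for p in positives for n in negatives]
    total = sum(bpr_pairwise_loss(scores[p], scores[n]) for p, n in pairs)
    return total / len(pairs)


# Hypothetical toy user: items "a" and "b" were interacted with, "c" was not.
positives, negatives = ["a", "b"], ["c"]

# Two score assignments that differ only in the order of the two positives.
scores_a_first = {"a": 2.0, "b": 1.0, "c": -1.0}
scores_b_first = {"a": 1.0, "b": 2.0, "c": -1.0}

loss_a = mean_pairwise_loss(scores_a_first, positives, negatives)
loss_b = mean_pairwise_loss(scores_b_first, positives, negatives)

# The two losses coincide: the pairwise objective cannot tell which
# positive item the user actually prefers.
print(loss_a, loss_b)
```

Both assignments receive the same loss even though they imply different full rankings, which is one simple way to see why a pairwise approximation leaves ranking information unused.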
