Make Large Language Model a Better Ranker
Wen-Shuo Chao, Zhi Zheng, Hengshu Zhu, Hao Liu
TL;DR
The paper tackles the misalignment between large language model generation and ranking tasks in recommender systems by proposing ALRO, a framework that combines Soft Lambda Loss (SLL) and Permutation-Sensitive Loss (PSL) within a supervised fine-tuning regime. By casting ranking as a language-generation problem with explicit feedback, deriving a differentiable ranking signal via soft-argmax, and mitigating input-order bias through PSL, ALRO achieves higher $NDCG@k$ than embedding-based and other LLM-based baselines across multiple datasets and backbone models. The approach shows notable gains with larger models and is more efficient than bootstrapping strategies, suggesting practical viability for LLM-driven re-ranking. Overall, ALRO integrates ranking objectives into LLM fine-tuning, improving recommendation quality while keeping inference costs feasible for real-world systems.
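To make the soft-argmax idea concrete, here is a minimal sketch of a generic soft-argmax: the softmax-weighted average of a set of candidate values, which varies smoothly with the logits where a hard argmax does not. It is not the paper's exact formulation; the function name `soft_argmax`, the `values` argument (for example rating levels 1 to 5), and the `temperature` parameter are assumptions made for illustration.

```python
import math


def soft_argmax(logits, values=None, temperature=1.0):
    """Softmax-weighted average of `values` (hypothetical helper).

    `values` defaults to the indices 0..n-1. Lower `temperature` pushes
    the result toward the value at the hard argmax of `logits`.
    """
    if values is None:
        values = list(range(len(logits)))
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return sum(v * e / total for v, e in zip(values, exps))


# Assumed example: logits over five rating tokens "1".."5".
rating_logits = [0.1, 0.2, 0.5, 2.0, 1.0]
expected_rating = soft_argmax(rating_logits, values=[1, 2, 3, 4, 5])
```

Because the output is a smooth function of the logits, a ranking loss computed on such scores can in principle be backpropagated through an autodiff framework; the pure-Python version above only illustrates the arithmetic.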
Abstract
Large Language Models (LLMs) demonstrate robust capabilities across various fields, leading to a paradigm shift toward LLM-enhanced Recommender Systems (RS). Research to date focuses on pointwise and pairwise recommendation paradigms, which are inefficient for LLM-based recommenders due to high computational costs. However, existing listwise approaches also fall short in ranking tasks due to misalignment between ranking objectives and next-token prediction. Moreover, these LLM-based methods struggle to effectively address the order relation among candidates, particularly given the scale of ratings. To address these challenges, this paper introduces the large language model framework with Aligned Listwise Ranking Objectives (ALRO). ALRO is designed to bridge the gap between the capabilities of LLMs and the nuanced requirements of ranking tasks. Specifically, ALRO employs explicit feedback in a listwise manner by introducing soft lambda loss, a customized adaptation of lambda loss designed for optimizing order relations. This mechanism provides more accurate optimization goals, enhancing the ranking process. Additionally, ALRO incorporates a permutation-sensitive learning mechanism that addresses position bias, a prevalent issue in generative models, without imposing additional computational burdens during inference. Our evaluation shows that ALRO outperforms both existing embedding-based recommendation methods and LLM-based recommendation baselines.
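For readers unfamiliar with the metric behind these comparisons, the following is a small sketch of $NDCG@k$ computed from graded relevance labels listed in ranked order. It assumes the common exponential-gain form $(2^{rel} - 1) / \log_2(i + 2)$ for the item at zero-based position $i$; the gain function actually used in the paper is not stated in this section, and the helper names `dcg_at_k` and `ndcg_at_k` are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (exponential gain assumed)."""
    return sum(
        (2 ** rel - 1) / math.log2(i + 2)  # position i is zero-based
        for i, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(ranked_relevances, k):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant items: define the score as 0
    return dcg_at_k(ranked_relevances, k) / ideal


# Assumed example: ratings of candidates in the order a model ranked them.
perfect = ndcg_at_k([5, 4, 2, 1], k=3)   # already sorted by rating
reversed_ = ndcg_at_k([1, 2, 4, 5], k=3)  # worst ordering of the same items
```

A ranking that places higher-rated items first scores closer to 1, which is why a loss that is sensitive to order relations among candidates is a natural training target for this metric.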
