Semantic Convergence: Harmonizing Recommender Systems via Two-Stage Alignment and Behavioral Semantic Tokenization
Guanghan Li, Xun Zhang, Yufei Zhang, Yifan Yin, Guojun Yin, Wei Lin
TL;DR
The paper tackles the gap between sparse collaborative signals in traditional recommender systems and dense language representations in large language models by introducing a two-stage alignment framework. Alignment Tokenization converts item IDs into a compact, semantically aligned token space via cascaded CodeBooks, while Alignment Task fine-tunes the LLM with sequential, textual, and query-oriented signals plus negative sampling. A dedicated inference strategy caches top-K item codes to reduce latency, enabling scalable, end-to-end LLM-assisted recommendations. Empirical results on three Amazon datasets demonstrate improved recall and NDCG, with ablations confirming the benefits of each component and a scaling trend suggesting larger LLMs further enhance performance. Overall, the approach offers a practical path to integrate LLM reasoning into recommender systems with improved efficiency and scalability.
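As a rough illustration of how cascaded CodeBooks can turn one item into a short token sequence, the sketch below uses plain residual quantization: each level picks the code nearest to what the previous levels left unexplained. This is an assumption about the general technique, not the paper's exact Alignment Tokenization module. The names `nearest_code` and `tokenize_item`, the toy codebooks, and the 2-dimensional embedding are all hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_code(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def tokenize_item(embedding: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize an item embedding into one token per codebook level."""
    residual = list(embedding)
    tokens: List[int] = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        tokens.append(idx)
        # Subtract the chosen code so the next level encodes what remains.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens


# Toy codebooks (hypothetical values): a coarse level and a finer residual level.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.2, -0.2]],
]
item_tokens = tokenize_item([1.2, 0.8], codebooks)
print(item_tokens)
```

In a trained system the codebooks would be learned rather than hand-written, and the resulting indices would be mapped to tokens in the LLM vocabulary.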
Abstract
Large language models (LLMs), endowed with exceptional reasoning capabilities, are adept at discerning profound user interests from historical behaviors, thereby presenting a promising avenue for the advancement of recommendation systems. However, a notable discrepancy persists between the sparse collaborative semantics typically found in recommendation systems and the dense token representations within LLMs. In our study, we propose a novel framework that harmoniously merges traditional recommendation models with the prowess of LLMs. We initiate this integration by transforming ItemIDs into sequences that align semantically with the LLM's representation space, through the proposed Alignment Tokenization module. Additionally, we design a series of specialized supervised learning tasks aimed at aligning collaborative signals with the subtleties of natural language semantics. To ensure practical applicability, we optimize online inference by pre-caching the top-K results for each user, reducing latency and improving efficiency. Extensive experimental evidence indicates that our model markedly improves recall metrics and that the resulting recommendation system scales remarkably well.
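The pre-caching idea can be sketched in a few lines: an offline pass scores candidate items per user, the top-K are stored, and the online path becomes a dictionary lookup. This is a minimal sketch under stated assumptions, not the paper's serving pipeline. The function names `build_topk_cache` and `recommend`, the user and item identifiers, and the score values are hypothetical, and a plain `dict` stands in for whatever cache store a real deployment would use.

```python
import heapq
from typing import Dict, List


def build_topk_cache(user_scores: Dict[str, Dict[str, float]], k: int) -> Dict[str, List[str]]:
    """Offline step: keep only the k highest-scoring items for each user."""
    cache: Dict[str, List[str]] = {}
    for user, scores in user_scores.items():
        top = heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])
        cache[user] = [item for item, _ in top]
    return cache


# Hypothetical scores, e.g. produced by an offline model pass.
offline_scores = {
    "u1": {"a": 0.9, "b": 0.1, "c": 0.5},
    "u2": {"a": 0.2, "b": 0.7},
}
cache = build_topk_cache(offline_scores, k=2)


def recommend(user: str) -> List[str]:
    """Online step: a lookup instead of a model call; unknown users get []."""
    return cache.get(user, [])


print(recommend("u1"))
```

The trade-off is freshness: cached results reflect user behavior only up to the last offline refresh, so the refresh interval has to be chosen against the latency savings.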
