Collaborative Semantic Alignment in Recommendation Systems
Chen Wang, Liangwei Yang, Zhiwei Liu, Xiaolong Liu, Mingdai Yang, Yueqing Liang, Philip S. Yu
TL;DR
CARec tackles the gap between collaborative filtering and semantic item representations with Reciprocal Alignment, a two-phase training paradigm. In the semantic aligning phase, items act as teachers and users as learners, so user embeddings are aligned to item semantics. In the collaborative refining phase the roles switch and users teach items: an MLP adaptor injects collaborative signals into the item representations while keeping their semantics intact. Empirical results on four real-world datasets show CARec achieves state-of-the-art performance in both warm and cold-start settings, outperforming ID-based and text-based baselines and enabling effective cold-item recommendations without extra modules. A case study and extensive ablations demonstrate that maintaining item semantic integrity while incorporating collaborative signals is key to CARec’s success, with instructor-xl often delivering the strongest semantic embeddings. The work highlights practical impact for robust recommendations in dynamic inventory and sparse-interaction domains, and points to future directions in multi-domain semantic preservation and more nuanced role-switch indicators.
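The two-phase schedule can be illustrated with a toy sketch. This is not the authors' implementation: the BPR-style pairwise loss, the residual linear adaptor (standing in for the MLP adaptor), the hand-written item vectors, and all function names are assumptions made for illustration. It only shows which parameters are frozen and which are trained in each phase.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def adapt(sem_vec, W):
    """Residual linear adaptor e + W e (assumed stand-in for the MLP adaptor)."""
    return [e + dot(row, sem_vec) for e, row in zip(sem_vec, W)]

def bpr_loss(users, items, triples):
    """Mean of -log sigmoid(score(u, pos) - score(u, neg)); an assumed objective."""
    losses = []
    for u, p, n in triples:
        x = dot(users[u], items[p]) - dot(users[u], items[n])
        losses.append(math.log1p(math.exp(-x)))
    return sum(losses) / len(losses)

def align_users(users, sem_items, triples, lr=0.1, epochs=50):
    """Phase 1 (semantic aligning): items are frozen teachers, users learn."""
    for _ in range(epochs):
        for u, p, n in triples:
            diff = [a - b for a, b in zip(sem_items[p], sem_items[n])]
            g = 1.0 - sigmoid(dot(users[u], diff))
            users[u] = [w + lr * g * d for w, d in zip(users[u], diff)]

def refine_items(users, sem_items, W, triples, lr=0.05, epochs=50):
    """Phase 2 (collaborative refining): users are frozen, only the adaptor W moves."""
    for _ in range(epochs):
        for u, p, n in triples:
            diff = [a - b for a, b in zip(sem_items[p], sem_items[n])]
            # The adaptor is linear, so adapt(diff) equals adapt(pos) - adapt(neg).
            g = 1.0 - sigmoid(dot(users[u], adapt(diff, W)))
            for r in range(len(W)):
                for c in range(len(diff)):
                    W[r][c] += lr * g * users[u][r] * diff[c]

# Hypothetical semantic item embeddings (in practice produced by a text encoder).
SEM_ITEMS = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
# Hypothetical (user, positive item, negative item) training triples.
TRIPLES = [(0, 0, 2), (0, 1, 3), (1, 2, 0), (1, 3, 1)]

def run():
    dim = 3
    users = [[0.0] * dim for _ in range(2)]
    W = [[0.0] * dim for _ in range(dim)]  # zero init: adapted items start at their semantics
    sem_before = [row[:] for row in SEM_ITEMS]
    loss_init = bpr_loss(users, SEM_ITEMS, TRIPLES)
    align_users(users, SEM_ITEMS, TRIPLES)
    loss_aligned = bpr_loss(users, SEM_ITEMS, TRIPLES)
    refine_items(users, SEM_ITEMS, W, TRIPLES)
    items = [adapt(e, W) for e in SEM_ITEMS]
    loss_refined = bpr_loss(users, items, TRIPLES)
    return loss_init, loss_aligned, loss_refined, SEM_ITEMS == sem_before

if __name__ == "__main__":
    print(run())
```

In this sketch the raw semantic embeddings are never updated. Collaborative information reaches the items only through the adaptor, which is one simple way to read "keeping their semantics intact".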
Abstract
Traditional recommender systems primarily leverage identity-based (ID) representations for users and items, while the advent of pre-trained language models (PLMs) has introduced rich semantic modeling of item descriptions. However, PLMs often overlook the vital collaborative filtering signals, which raises two challenges: merging the collaborative and semantic representation spaces, and fine-tuning semantic representations to better fit warm-start conditions. Our work introduces CARec, a model that integrates collaborative filtering with semantic representations, aligning both within the semantic space while retaining key semantics. Our experiments across four real-world datasets show significant performance improvements. CARec's collaborative alignment approach also extends its applicability to cold-start scenarios, where it demonstrates notable enhancements in recommendation accuracy. The code will be available upon paper acceptance.
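The cold-start claim rests on users and items sharing the semantic space: an item with no interactions still has a text embedding, so it can be scored like any other item. The sketch below illustrates that idea only. Dot-product scoring, the linear adaptor, the weights, and the item names are all hypothetical, not taken from the paper.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def adapt(sem_vec, W):
    """Residual linear adaptor e + W e (assumed stand-in for a trained adaptor)."""
    return [e + dot(row, sem_vec) for e, row in zip(sem_vec, W)]

def rank_items(user_vec, sem_items, W):
    """Rank item names by dot product between the user and adapted item vectors."""
    scores = {name: dot(user_vec, adapt(vec, W)) for name, vec in sem_items.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical trained adaptor weights and user vector.
W = [[0.1, 0.0], [0.0, 0.1]]
user = [1.0, 0.2]

# Two warm items plus one cold item that only has a semantic embedding.
catalog = {"warm_a": [0.9, 0.1], "warm_b": [0.1, 0.9]}
catalog["cold_c"] = [0.8, 0.3]

ranking = rank_items(user, catalog, W)
print(ranking)
```

The cold item goes through the same adaptor and scoring path as the warm items, so no separate cold-start module is needed in this toy setup.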
