Matryoshka Representation Learning for Recommendation
Riwei Lai, Li Chen, Weixin Chen, Rui Chen
TL;DR
This work addresses the mismatch between flat embeddings and the hierarchical structure of real-world user preferences and item features in recommender systems. It proposes Matryoshka Representation Learning for Recommendation (MRL4Rec), which encodes users and items into a sequence of overlapping vector spaces of increasing dimensionality and optimizes each level with a level-aware BPR objective. A theoretical analysis clarifies why level-specific training triplets are necessary and motivates a dedicated matryoshka negative sampling (MNS) scheme. Empirically, MRL4Rec (MRL-M) consistently outperforms state-of-the-art entangled methods and competitive disentangled methods on real datasets in both Recall@K and NDCG@K, and ablations confirm the importance of MNS and of the hierarchical structure. The approach captures hierarchical preferences and features more accurately while remaining efficient, offering a practical path to improved recommendations and a framework for future hierarchical representation learning.
Abstract
Representation learning is essential for deep-neural-network-based recommender systems to capture user preferences and item features within fixed-dimensional user and item vectors. Existing representation learning methods either treat all user preferences and item features uniformly or categorize them into discrete clusters. In contrast, we argue that in the real world, user preferences and item features are naturally expressed and organized hierarchically, which opens a new direction for representation learning. In this paper, we introduce a novel matryoshka representation learning method for recommendation (MRL4Rec), which restructures user and item vectors into matryoshka representations: overlapping vector spaces of incrementally increasing dimensionality that explicitly represent user preferences and item features at different hierarchical levels. We theoretically establish that constructing training triplets specific to each level is pivotal to accurate matryoshka representation learning. We then propose a matryoshka negative sampling mechanism to construct these training triplets, which further ensures that the matryoshka representations capture hierarchical user preferences and item features effectively. Experiments demonstrate that MRL4Rec consistently and substantially outperforms a number of state-of-the-art competitors on several real-life datasets. Our code is publicly available at https://github.com/Riwei-HEU/MRL.
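
To make the idea of level-specific training concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that each hierarchical level corresponds to a prefix slice of the full embedding (so the vector spaces overlap and grow in dimension) and that the per-level losses are simply summed. The names `matryoshka_bpr_loss`, `level_dims`, and `neg_items_per_level` are hypothetical. The matryoshka negative sampling mechanism itself is not reproduced here; the sketch only assumes that a separate set of negative items has already been drawn for each level.

```python
import numpy as np


def bpr_loss(u, pos, neg):
    """Standard BPR loss, -log(sigmoid(u.pos - u.neg)), averaged over the batch."""
    diff = np.sum(u * pos, axis=1) - np.sum(u * neg, axis=1)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed stably with logaddexp.
    return float(np.mean(np.logaddexp(0.0, -diff)))


def matryoshka_bpr_loss(user_emb, item_emb, users, pos_items,
                        neg_items_per_level, level_dims):
    """Sum of BPR losses over nested prefix slices of the embeddings.

    Assumption (not confirmed by the abstract): level l uses the first
    level_dims[l] coordinates of every user and item vector, and each
    level has its own array of negative item ids.
    """
    if len(level_dims) != len(neg_items_per_level):
        raise ValueError("need one negative-item array per level")
    total = 0.0
    for d, neg_items in zip(level_dims, neg_items_per_level):
        u = user_emb[users, :d]       # user vectors truncated to level size d
        p = item_emb[pos_items, :d]   # positive items at the same level
        n = item_emb[neg_items, :d]   # level-specific negatives
        total += bpr_loss(u, p, n)
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    user_emb = rng.normal(size=(4, 8))    # 4 users, full dimension 8
    item_emb = rng.normal(size=(10, 8))   # 10 items, full dimension 8
    users = np.array([0, 1, 2, 3])
    pos_items = np.array([1, 3, 5, 7])
    level_dims = [2, 4, 8]                # hypothetical level sizes
    # Placeholder: uniform random negatives per level, standing in for MNS.
    negs = [rng.integers(0, 10, size=4) for _ in level_dims]
    print(matryoshka_bpr_loss(user_emb, item_emb, users, pos_items,
                              negs, level_dims))
```

Passing a distinct array of negatives for each level reflects the abstract's claim that training triplets should be constructed per level; how those negatives are chosen is the role of the matryoshka negative sampling mechanism described in the paper.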
