RimiRec: Modeling Refined Multi-interest in Hierarchical Structure for Recommendation
Haolei Pei, Yuanyuan Xu, Yangping Zhu, Yuan Nie
TL;DR
This work addresses the challenge of capturing refined user interests that manifest across hierarchical circles in large-scale recommender systems. It introduces RimiRec, a two-stage approach: (1) hierarchical multi-interest mining uses hierarchical $k$-means to build an item tree and a Transformer-based seq2seq model with early stopping and beam search to generate semantic category IDs at multiple levels; (2) refined multi-interest retrieval partitions the item library by these categories and learns multiple category-specific embeddings via a modified two-tower model, enabling ANN searches within sub-libraries. The method achieves state-of-the-art recall on offline datasets and demonstrates production viability through online A/B tests at Lofter, with improvements in metrics like diversity and user activity. The work provides a practical pipeline for deploying hierarchical, circle-aware recommendations in industrial settings, balancing long behavior sequences with fine-grained interests across levels.
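To make the item-tree idea in the first stage concrete, here is a minimal sketch of hierarchical $k$-means that assigns every item a path of cluster IDs, one per level. It is an illustration under stated assumptions, not the authors' implementation: the function names (`kmeans`, `build_item_tree`), the deterministic initialisation, and the parameters `branching` and `depth` are all hypothetical, and items are assumed to already have embedding vectors.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, iters: int = 20) -> List[int]:
    """Plain Lloyd's k-means; returns one cluster label per point.

    Assumption: centers are initialised deterministically from points
    spread evenly through the input list (not a method from the paper).
    """
    step = max(len(points) // k, 1)
    centers = [list(points[min(i * step, len(points) - 1)]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: _sq_dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def build_item_tree(
    items: Dict[str, Vector], branching: int, depth: int
) -> Dict[str, Tuple[int, ...]]:
    """Assign each item a path of cluster IDs, one ID per tree level."""
    paths: Dict[str, Tuple[int, ...]] = {item_id: () for item_id in items}

    def split(item_ids: List[str], level: int) -> None:
        # Stop at the requested depth or when a node is too small to split.
        if level == depth or len(item_ids) <= branching:
            return
        labels = kmeans([items[i] for i in item_ids], branching)
        groups: Dict[int, List[str]] = {}
        for item_id, label in zip(item_ids, labels):
            paths[item_id] = paths[item_id] + (label,)
            groups.setdefault(label, []).append(item_id)
        for group in groups.values():
            split(group, level + 1)

    split(list(items), 0)
    return paths
```

In this sketch, the tuple returned for an item plays the role of its multi-level category ID; small nodes are not split further, so paths may be shorter than `depth`.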
Abstract
Industrial recommender systems usually consist of a retrieval stage and a ranking stage in order to handle billions of users and items. The retrieval stage, which retrieves candidate items relevant to user interests, has attracted much attention. Users frequently show refined multi-interests organized in a hierarchical structure. For example, a user may like Conan and Kuroba Kaito, two characters that sit within the hierarchy "Animation, Japanese Animation, Detective Conan". However, most existing methods ignore this hierarchical nature and simply average the fine-grained interest information. We therefore propose a novel two-stage approach that explicitly models refined multi-interest in a hierarchical structure for recommendation. In the first stage, hierarchical multi-interest mining, hierarchical clustering and a Transformer-based model adaptively generate the circles or sub-circles that a user is interested in. In the second stage, the retrieval space is partitioned so that the embedding-based retrieval (EBR) models deal only with items within each circle and can accurately capture users' refined interests. Experimental results show that the proposed approach achieves state-of-the-art performance. Our framework has also been deployed at Lofter.
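The second stage can be pictured with a small sketch: the item library is split into per-circle sub-libraries, and each of a user's circle-specific embeddings searches only its own sub-library. This is a hedged illustration, not the deployed system: the helper names (`partition_items`, `retrieve`), the inner-product score, and the exhaustive sort standing in for an ANN index are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product, used here as the relevance score (an assumption)."""
    return sum(x * y for x, y in zip(a, b))


def partition_items(item_category: Dict[str, str]) -> Dict[str, List[str]]:
    """Group item IDs into one sub-library per category (circle)."""
    sub_libraries: Dict[str, List[str]] = defaultdict(list)
    for item_id, category in item_category.items():
        sub_libraries[category].append(item_id)
    return dict(sub_libraries)


def retrieve(
    user_embeddings: Dict[str, Vector],
    item_embeddings: Dict[str, Vector],
    sub_libraries: Dict[str, List[str]],
    top_k: int,
) -> Dict[str, List[str]]:
    """For each circle the user is interested in, rank only that circle's items.

    A production system would query an ANN index per sub-library; the
    exhaustive sort below is a stand-in that keeps the sketch self-contained.
    """
    results: Dict[str, List[str]] = {}
    for category, user_vec in user_embeddings.items():
        candidates = sub_libraries.get(category, [])
        ranked = sorted(
            candidates,
            key=lambda item_id: dot(user_vec, item_embeddings[item_id]),
            reverse=True,
        )
        results[category] = ranked[:top_k]
    return results
```

Because each user embedding is scored only against items in its own circle, items from unrelated circles never compete with it, which is the effect the partition is meant to achieve.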
