Multi-granularity Interest Retrieval and Refinement Network for Long-Term User Behavior Modeling in CTR Prediction
Xiang Xu, Hao Wang, Wei Guo, Luankang Zhang, Wanshan Yang, Runlong Yu, Yong Liu, Defu Lian, Enhong Chen
TL;DR
MIRRN introduces multi-granularity interest retrieval and subsequence refinement to address limitations of prior long-term user behavior models in CTR prediction. It constructs target, local, and global queries to retrieve diverse subsequences via SimHash-based search, then refines these subsequences with Target-Aware Position Encoding and a Multi-Head Fourier Transformer to capture sequential and interactive information. The model activates the contributions of each granularity through multi-head target attention, and aggregates this into a CTR prediction via an MLP, achieving superior performance on three large-scale datasets and an online A/B test. The approach delivers practical benefits with modest inference-time overhead, validated by ablations and real-world deployment in Huawei Music.
Abstract
Click-through Rate (CTR) prediction is crucial for online personalization platforms. Recent advancements have shown that modeling rich user behaviors can significantly improve the performance of CTR prediction. Current long-term user behavior modeling algorithms predominantly follow two cascading stages. The first stage retrieves subsequence related to the target item from the long-term behavior sequence, while the second stage models the relationship between the subsequence and the target item. Despite significant progress, these methods have two critical flaws. First, the retrieval query typically includes only target item information, limiting the ability to capture the user's diverse interests. Second, relational information, such as sequential and interactive information within the subsequence, is frequently overlooked. Therefore, it requires to be further mined to more accurately model user interests. To this end, we propose Multi-granularity Interest Retrieval and Refinement Network (MIRRN). Specifically, we first construct queries based on behaviors observed at different time scales to obtain subsequences, each capturing users' interest at various granularities. We then introduce an noval multi-head Fourier transformer to efficiently learn sequential and interactive information within the subsequences, leading to more accurate modeling of user interests. Finally, we employ multi-head target attention to adaptively assess the impact of these multi-granularity interests on the target item. Extensive experiments have demonstrated that MIRRN significantly outperforms state-of-the-art baselines. Furthermore, an A/B test shows that MIRRN increases the average number of listening songs by 1.32% and the average time of listening songs by 0.55% on the Huawei Music App. The implementation code is publicly available at https://github.com/USTC-StarTeam/MIRRN.
