Deep Uncertainty-Based Explore for Index Construction and Retrieval in Recommendation System
Xin Jiang, Kaiqiang Wang, Yinlong Wang, Fengchang Lv, Taiyang Peng, Shuai Yang, Xianteng Wu, Pengye Zhang, Shuo Yuan, Yifan Zeng
TL;DR
This work introduces uncertainty-aware exploration into deep index construction and retrieval for recommendation systems. The method, UICR, adds uncertainty modeling to both index construction (UN-Index) and retrieval (UN-Retrieval) via a multi-task framework (UN-Model) that predicts relevance scores and uncertainties for item-to-item and user-to-item pairs. It uses variance-weighted distances in the index and a variance-weighted beam search to fuse scores with uncertainty, improving novelty while maintaining recall. Experiments on public datasets and Shopee production data show improved Recall@N and novelty metrics, and an online A/B test in display advertising shows revenue and CTR gains. The work demonstrates a practical path to balancing immediate relevance and long-term exploration in industrial-scale recommender systems.
Abstract
In recommendation systems, the relevance and novelty of the final results are determined by a cascade of Matching -> Ranking -> Strategy stages. The matching model is the starting point of this pipeline and sets the upper bound for the subsequent stages. Balancing the relevance and novelty of matching results is therefore a crucial step in the design and optimization of recommendation systems and contributes significantly to recommendation quality. However, typical matching algorithms do not address relevance and novelty simultaneously. One main reason is that deep matching algorithms exhibit significant uncertainty when estimating long-tail items (e.g., due to insufficient training samples). This uncertainty not only affects the training of the models but also influences the confidence in their index construction and beam search retrieval. This paper proposes the UICR (Uncertainty-based explore for Index Construction and Retrieval) algorithm, which introduces uncertainty modeling in the matching stage and achieves multi-task modeling of model uncertainty and index uncertainty. The final matching results are obtained by combining the relevance score and the uncertainty score inferred by the model. Experimental results demonstrate that UICR improves novelty without sacrificing relevance in real-world industrial production environments and on multiple open-source datasets. Notably, online A/B test results for display advertising at Shopee demonstrate the effectiveness of the proposed algorithm.
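To make the idea of combining a relevance score with an uncertainty score concrete, the following is a minimal sketch only. The abstract does not give the fusion formula, so the additive, UCB-style rule below (relevance plus a bonus proportional to the predicted standard deviation) is an assumption, as are the names `fuse_score`, `top_k`, the weight `alpha`, and the toy candidate values.

```python
import math

def fuse_score(relevance, variance, alpha=0.5):
    # Assumed fusion rule (not taken from the paper): relevance plus an
    # exploration bonus proportional to the predicted standard deviation.
    return relevance + alpha * math.sqrt(variance)

def top_k(candidates, k, alpha=0.5):
    # candidates: list of (item_id, relevance, variance) tuples.
    ranked = sorted(
        candidates,
        key=lambda c: fuse_score(c[1], c[2], alpha),
        reverse=True,
    )
    return [item_id for item_id, _, _ in ranked[:k]]

# Hypothetical toy data: a confident head item, an uncertain long-tail item,
# and a weakly relevant item.
candidates = [
    ("head", 0.80, 0.0001),
    ("tail", 0.75, 0.04),
    ("weak", 0.30, 0.01),
]

pure_relevance = top_k(candidates, k=2, alpha=0.0)
with_exploration = top_k(candidates, k=2, alpha=0.5)
```

With `alpha=0.0` the ranking depends on relevance alone and the head item comes first. With `alpha=0.5` the uncertain long-tail item overtakes it, while the weakly relevant item still stays out of the top two. This illustrates how an uncertainty bonus can raise novelty without admitting clearly irrelevant candidates.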
