pEBR: A Probabilistic Approach to Embedding Based Retrieval

Han Zhang, Yunjiang Jiang, Mingming Li, Haowei Yuan, Yiming Qiu, Wen-Yun Yang

TL;DR

pEBR reframes embedding-based retrieval as a probabilistic task: it models the item distribution conditioned on each query and derives a dynamic cutoff from that distribution's CDF. It implements two probabilistic families, ExpNCE and BetaNCE, within a two-tower framework, and demonstrates improved recall and precision over fixed top-k and fixed-score baselines in offline and online experiments. The key contributions are a formal MLE/NCE-based probabilistic formulation for retrieval, two concrete instantiations (ExpNCE and BetaNCE) with tractable CDF-based thresholds, and extensive validation showing gains across head, torso, and tail queries, including in online experiments. In practice, the approach yields adaptive candidate sets that align with query popularity, reducing irrelevant results for tail queries while increasing recall for head queries in large-scale systems.

Abstract

Embedding-based retrieval aims to learn a shared semantic representation space for both queries and items, enabling efficient and effective item retrieval through approximate nearest neighbor (ANN) algorithms. In current industrial practice, retrieval systems typically retrieve a fixed number of items for each query. However, this fixed-size retrieval often results in insufficient recall for head queries and low precision for tail queries. This limitation largely stems from the dominance of frequentist approaches in loss function design, which fail to address this challenge in industry. In this paper, we propose a novel probabilistic Embedding-Based Retrieval (pEBR) framework. Our method models the item distribution conditioned on each query, enabling the use of a dynamic cosine similarity threshold derived from the cumulative distribution function (CDF) of the probabilistic model. Experimental results demonstrate that pEBR significantly improves both retrieval precision and recall. Furthermore, ablation studies reveal that the probabilistic formulation effectively captures the inherent differences between head and tail queries.
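
To make the idea of a CDF-derived cutoff concrete, here is a minimal Python sketch. It is not the paper's ExpNCE or BetaNCE formulation, which this page does not spell out. It assumes, purely for illustration, that x = 1 - cos(q, d) for relevant items follows a truncated exponential on [0, 2] with a per-query rate. The function names, the `rate` parameter, and the example scores are all hypothetical; only the CDF level 0.985 comes from the Figure 2 caption.

```python
import math


def cosine_threshold(rate: float, cdf_level: float) -> float:
    """Cosine cutoff that keeps `cdf_level` of the assumed relevant-item mass.

    Illustrative assumption (not taken from the paper): x = 1 - cos(q, d)
    lies in [0, 2] and follows a truncated exponential with per-query `rate`.
    """
    if rate <= 0.0 or not 0.0 < cdf_level < 1.0:
        raise ValueError("rate must be > 0 and cdf_level must be in (0, 1)")
    # Normalizer of the exponential truncated to [0, 2].
    norm = 1.0 - math.exp(-2.0 * rate)
    # Invert F(x) = (1 - exp(-rate * x)) / norm at F(x) = cdf_level.
    x = -math.log(1.0 - cdf_level * norm) / rate
    return 1.0 - x


def retrieve(scores: dict[str, float], rate: float, cdf_level: float = 0.985) -> list[str]:
    """Return every item whose cosine score clears the per-query cutoff."""
    threshold = cosine_threshold(rate, cdf_level)
    kept = [item for item, score in scores.items() if score >= threshold]
    # Highest-scoring items first.
    return sorted(kept, key=scores.get, reverse=True)


# Hypothetical cosine scores for one query's candidates.
scores = {"a": 0.95, "b": 0.85, "c": 0.50, "d": 0.20}
concentrated = retrieve(scores, rate=20.0)  # tight distribution, strict cutoff
broad = retrieve(scores, rate=5.0)          # wide distribution, lenient cutoff
```

Under these assumptions, the same CDF level yields a different number of retrieved items per query: a smaller rate (a broader distribution, as one might expect for a head query) lowers the cosine cutoff and admits more candidates, while a larger rate raises it and trims the list.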

Paper Structure

This paper contains 29 sections, 18 equations, 2 figures, 6 tables.

Figures (2)

  • Figure 1: Relevant item distributions projected to a sphere for queries in head, torso, and tail categories, respectively.
  • Figure 2: Histogram of the retrieved item numbers for head, torso, and tail queries, with the CDF value set to 0.985.