Mitigating Pooling Bias in E-commerce Search via False Negative Estimation

Xiaochen Wang, Xiao Xiao, Ruhan Zhang, Xuan Zhang, Taesik Na, Tejaswi Tenneti, Haixun Wang, Fenglong Ma

TL;DR

This work tackles pooling bias in e-commerce search caused by false negatives introduced during negative sampling. It introduces Bias-mitigating Hard Negative Sampling (BHNS), which uses False Negative Estimation to assign each sampled pair a probability of actually being relevant, then uses that probability both to regularize sampling and to generate pseudo labels. The approach yields consistent performance gains on semantic similarity benchmarks, offline Instacart data, and a production-like search system, at the cost of a modest increase in training time. The results suggest that BHNS is domain-agnostic and has practical value for cross-encoder-based relevance assessment in e-commerce and beyond.

Abstract

Efficient and accurate product relevance assessment is critical for user experiences and business success. Training a proficient relevance assessment model requires high-quality query-product pairs, often obtained through negative sampling strategies. Unfortunately, current methods introduce pooling bias by mistakenly sampling false negatives, diminishing performance and business impact. To address this, we present Bias-mitigating Hard Negative Sampling (BHNS), a novel negative sampling strategy tailored to identify and adjust for false negatives, building upon our original False Negative Estimation algorithm. Our experiments in the Instacart search setting confirm BHNS as effective for practical e-commerce use. Furthermore, comparative analyses on public datasets showcase its domain-agnostic potential for diverse applications.

Paper Structure

This paper contains 32 sections, 8 equations, 3 figures, 6 tables, and 1 algorithm.

Figures (3)

  • Figure 1: In e-commerce scenarios, conventional negative sampling usually assumes that the original query and the sampled products are irrelevant to each other, which introduces pooling bias by producing false negative pairs such as [Honey, Wildflower Honey] or [Apple, Apple Sauce]. These pairs are wrongly labeled as irrelevant and thus harm the performance of the search model.
  • Figure 2: Bias-mitigating Hard Negative Sampling. By leveraging False Negative Estimation, the sampler detects potential false negatives and then (1) reduces their weight during hard negative sampling and (2) informs the model of the likelihood of false negatives through the pseudo label.
  • Figure 3: Overview of Instacart's ranking system incorporating the BHNS-boosted cross-encoder. The cross-encoder is trained independently on the Instacart search dataset (see the dataset section of the paper). At inference time, the output of the BHNS-boosted cross-encoder serves as a necessary feature in the Instacart ranking system, facilitating the ranking process along with other features.
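
The two roles described in the Figure 2 caption can be illustrated with a small sketch. This is not the paper's algorithm: the exact formulas are not given in this summary, so the weighting used here (a softmax over a hardness score, multiplied by one minus the estimated false-negative probability) and the choice to use that probability directly as the soft pseudo label are assumptions made purely for illustration. The function name `sample_negatives`, its parameters, and the example scores are hypothetical.

```python
import math
import random


def sample_negatives(candidates, hardness, fn_prob, k, seed=0):
    """Sample up to k negatives, down-weighting likely false negatives.

    candidates: candidate product names for one query (hypothetical data)
    hardness:   scores where higher means a harder negative (assumed given)
    fn_prob:    estimated false-negative probabilities in [0, 1] (assumed
                to come from some false negative estimator)
    Returns a list of (candidate, pseudo_label) pairs.
    """
    # Softmax numerators over hardness, so harder candidates are preferred.
    top = max(hardness)
    exp_scores = [math.exp(h - top) for h in hardness]

    # Role (1): shrink the weight of candidates that are probably relevant.
    # ASSUMPTION: the (1 - p) factor is an illustrative choice.
    weights = [e * (1.0 - p) for e, p in zip(exp_scores, fn_prob)]

    rng = random.Random(seed)
    remaining = list(range(len(candidates)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        current = [weights[i] for i in remaining]
        if sum(current) <= 0:
            break  # only certain false negatives are left; stop sampling
        pick = rng.choices(remaining, weights=current, k=1)[0]
        remaining.remove(pick)  # sample without replacement
        # Role (2): pass the false-negative likelihood on as a soft label
        # instead of a hard 0. ASSUMPTION: label equals the probability.
        chosen.append((candidates[pick], fn_prob[pick]))
    return chosen


pairs = sample_negatives(
    ["Wildflower Honey", "Maple Syrup", "Dish Soap"],
    hardness=[3.0, 2.0, 0.5],
    fn_prob=[1.0, 0.2, 0.0],
    k=2,
)
print(pairs)
```

In this toy call for the query "Honey", "Wildflower Honey" has an estimated false-negative probability of 1.0, so its sampling weight is zero and it is never drawn, while "Maple Syrup" is sampled with a soft label of 0.2 rather than a hard 0.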