Weak Supervision for Improved Precision in Search Systems
Sriram Vasudevan
TL;DR
The paper tackles the high cost of manually labeling data for industrial search by introducing a scalable weak supervision pipeline. The pipeline combines labeling functions (LFs) authored by subject-matter experts (SMEs) with a small seed gold dataset to generate high-quality training labels. A probabilistic weak labeler learns from LF agreements and produces a label probability $p$ for each query-document pair; these probabilities relabel the training data used in a listwise Learning to Rank objective for a large deep neural network. Offline results show substantial gains in ranking quality (NDCG@10 improvements of roughly $34\%$ to $42\%$) and stronger alignment with user interaction signals, while online deployment confirms improved precision and user-relevant outcomes. The work demonstrates a practical, scalable way to use organizational knowledge and limited ground truth to outperform engagement-only labeling in production search systems.
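To make the weak-labeling step concrete, here is a minimal sketch of how labeling-function votes could be combined into a label probability $p$. It is an illustration only, not the paper's labeler: the three labeling functions, the field names (`query`, `title`, `age_days`, `dwell_seconds`), and the naive-Bayes-style aggregation that assumes independent LFs are all assumptions made for this example. LF accuracies are estimated on a small seed gold set and then used as log-odds weights.

```python
import math

ABSTAIN = None  # an LF returns 1 (relevant), 0 (not relevant), or ABSTAIN

# Hypothetical labeling functions; the real ones would encode SME knowledge.
def lf_title_match(pair):
    return 1 if pair["query"].lower() in pair["title"].lower() else ABSTAIN

def lf_stale_document(pair):
    return 0 if pair["age_days"] > 365 else ABSTAIN

def lf_short_dwell(pair):
    return 0 if pair["dwell_seconds"] < 5 else ABSTAIN

LFS = [lf_title_match, lf_stale_document, lf_short_dwell]

def estimate_accuracies(gold, lfs, smoothing=1.0):
    """Estimate each LF's accuracy on the seed gold set (Laplace-smoothed)."""
    accuracies = []
    for lf in lfs:
        correct = total = 0
        for pair, label in gold:
            vote = lf(pair)
            if vote is ABSTAIN:
                continue  # abstentions carry no evidence
            total += 1
            correct += int(vote == label)
        accuracies.append((correct + smoothing) / (total + 2 * smoothing))
    return accuracies

def label_probability(pair, lfs, accuracies, prior=0.5):
    """Combine LF votes into p = P(relevant | votes), assuming independent LFs."""
    log_odds = math.log(prior / (1 - prior))
    for lf, acc in zip(lfs, accuracies):
        vote = lf(pair)
        if vote is ABSTAIN:
            continue
        weight = math.log(acc / (1 - acc))  # more accurate LFs get larger weight
        log_odds += weight if vote == 1 else -weight
    return 1 / (1 + math.exp(-log_odds))
```

With no LF firing, the probability falls back to the prior; each vote then shifts it in proportion to how reliable that LF was on the gold set.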
Abstract
Labeled datasets are essential for modern search engines, which increasingly rely on supervised learning methods like Learning to Rank and massive amounts of data to power deep learning models. However, creating these datasets is both time-consuming and costly, leading to the common use of user click and activity logs as proxies for relevance. In this paper, we present a weak supervision approach to infer the quality of query-document pairs and apply it within a Learning to Rank framework to enhance the precision of a large-scale search system.
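As a rough illustration of how inferred quality scores could feed a Learning to Rank objective, the sketch below computes a ListNet-style listwise cross-entropy for a single query. This section does not specify the exact loss, so the choice of ListNet, the normalization of label probabilities into a target distribution, and the function names are assumptions made for this example.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def listwise_loss(scores, label_probs):
    """Cross-entropy between target and predicted distributions for one query.

    scores:      model scores, one per candidate document
    label_probs: weak-label probabilities p, one per candidate document
    """
    total = sum(label_probs)
    if total == 0:
        return 0.0  # no positive evidence in this list; contributes no loss
    # Assumption: the target distribution is the normalized label probabilities.
    target = [p / total for p in label_probs]
    predicted = softmax(scores)
    return -sum(t * math.log(q) for t, q in zip(target, predicted) if t > 0)
```

The loss is lower when the model scores documents in the same order as their weak-label probabilities, so soft labels carry graded relevance information that a binary click label would not.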
