MOBIUS: Towards the Next Generation of Query-Ad Matching in Baidu's Sponsored Search

Miao Fan, Jiacheng Guo, Shuai Zhu, Shuo Miao, Mingming Sun, Ping Li

TL;DR

This paper will elaborate on how active learning is adopted to overcome the insufficiency of click history at the matching layer when training the authors' neural click networks offline, and how the SOTA ANN search technique is used for retrieving ads more efficiently.

Abstract

Baidu runs the largest commercial web search engine in China, serving hundreds of millions of online users every day in response to a great variety of queries. In order to build a high-efficiency sponsored search engine, we used to adopt a three-layer funnel-shaped structure to screen and sort hundreds of ads from billions of ad candidates, subject to the requirement of low response latency and the constraints of computing resources. Given a user query, the top matching layer is responsible for providing semantically relevant ad candidates to the next layer, while the ranking layer at the bottom is more concerned with the business indicators (e.g., CPM, ROI) of those ads. The clear separation between the matching and ranking objectives results in a lower commercial return. The Mobius project has been established to address this serious issue. It is our first attempt to train the matching layer to consider CPM as an additional optimization objective besides query-ad relevance, by directly predicting CTR (click-through rate) from billions of query-ad pairs. Specifically, this paper will elaborate on how we adopt active learning to overcome the insufficiency of click history at the matching layer when training our neural click networks offline, and how we use the SOTA ANN search technique for retrieving ads more efficiently (here, "ANN" stands for approximate nearest neighbor search). We contribute the solutions to Mobius-V1 as the first version of our next-generation query-ad matching system.

Paper Structure (17 sections, 3 equations, 6 figures, 3 tables, 1 algorithm)

Figures (6)

  • Figure 1: A screenshot of Baidu Search results on a mobile phone for the search query "tourism in Alaska" (in Chinese). Our sponsored search engine is in charge of providing helpful ads on each page before the organic search results.
  • Figure 2: The three-layer funnel-shaped structure of the previous sponsored search system in Baidu. Given a user query, it efficiently retrieves hundreds of relevant and high-CPM ads from billions of ad candidates.
  • Figure 3: The distinct objectives of the matching and ranking layers lead to lower CPM, which is one of the key business indicators of a sponsored search engine. Therefore, we are building a highly efficient query-ad matching system (i.e., Mobius) in Baidu sponsored search. Mobius is expected to unify the learning objectives of query-ad relevance and many other business indicators, subject to low response latency, limited computation resources, and minimal adverse impact on user experience. For now, we have deployed the first version of Mobius (Mobius-V1), which can more accurately predict CTR for billions of user query & ad pairs.
  • Figure 4: An example of a bad case that the original CTR model could not handle well. As the neural click network employed by the ranking layer was originally trained on high-frequency ads and queries, it tends to estimate a higher CTR for a query-ad pair once a high-frequency ad (e.g., "Mercedes-Benz" in this case) appears, even though "White Rose" and "Mercedes-Benz" have little relevance.
  • Figure 5: The flow diagram of actively training our neural click model with augmented data. The iterative procedure has two phases: data augmentation and CTR model learning. The data augmentation phase starts by loading a batch of click history (i.e., user query & ad pairs) into a data augmenter. The data augmenter applies a cross join operation to generate more user query & ad pairs, including pairs that do not appear in the click history. Then we bring in the original matching model as a teacher to grade the relevance of those pairs. The teacher sets a threshold to retain the irrelevant query-ad pairs, which are then fed into our neural click network. Our neural click network acts as a student and tries to predict the CTRs of these query-ad pairs. A data sampler is responsible for sampling and labeling the bad cases (i.e., user query & ad pairs with lower relevance but higher CTR). After the training buffer is augmented with the bad cases, we start the second phase and train our neural click model, which predicts a probability distribution over three categories: click, unclick, and bad. Once the augmented data in the buffer have been used, we clear the buffer and wait to load the next batch of click history.
  • ...and 1 more figures
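
The data augmentation phase described in the Figure 5 caption can be illustrated with a minimal Python sketch. This is a simplification under stated assumptions, not the authors' implementation: the function name `augment_bad_cases`, both threshold values, and the toy teacher and student scorers are hypothetical. The sketch also skips pairs already present in the click history and replaces the data sampler with a fixed CTR cutoff, neither of which the caption specifies.

```python
from itertools import product


def augment_bad_cases(click_history, teacher_relevance, student_ctr,
                      relevance_threshold=0.3, ctr_threshold=0.5):
    """One data-augmentation pass in the spirit of Figure 5.

    click_history: list of (query, ad) pairs observed in the click logs.
    teacher_relevance, student_ctr: callables (query, ad) -> float in [0, 1].
    Both thresholds are illustrative values, not taken from the paper.
    """
    queries = sorted({q for q, _ in click_history})
    ads = sorted({a for _, a in click_history})
    seen = set(click_history)

    bad_cases = []
    # Cross join: pair every query with every ad in the batch.
    for query, ad in product(queries, ads):
        if (query, ad) in seen:
            continue  # assumption: pairs with click history are not relabeled
        # Teacher keeps only the pairs it grades as irrelevant.
        if teacher_relevance(query, ad) >= relevance_threshold:
            continue
        # Irrelevant pairs that the student still scores highly are bad cases.
        # (A fixed cutoff stands in for the caption's data sampler.)
        if student_ctr(query, ad) > ctr_threshold:
            bad_cases.append((query, ad, "bad"))
    return bad_cases


def toy_teacher(query, ad):
    # Hypothetical relevance score: Jaccard overlap of lowercase tokens.
    q, a = set(query.lower().split()), set(ad.lower().split())
    return len(q & a) / len(q | a)


def toy_student(query, ad):
    # Hypothetical biased CTR model that overrates one high-frequency ad.
    return 0.9 if "mercedes-benz" in ad.lower() else 0.1


history = [
    ("mercedes-benz dealer", "Mercedes-Benz official store"),
    ("white rose", "White rose bouquet delivery"),
]
bad = augment_bad_cases(history, toy_teacher, toy_student)
# Training buffer: clicked pairs plus the newly labeled bad cases.
training_buffer = [(q, a, "click") for q, a in history] + bad
```

With these toy scorers, the only pair labeled "bad" is the "white rose" query matched with the Mercedes-Benz ad, which mirrors the failure shown in Figure 4. The "unclick" category from the caption is not modeled here.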