GraphEx: A Graph-based Extraction Method for Advertiser Keyphrase Recommendation
Ashirbad Mishra, Soumik Dey, Marshall Wu, Jinyu Zhao, He Yu, Kaichen Ni, Binbin Li, Kamesh Madduri
TL;DR
GraphEx reframes keyphrase recommendation as extraction from item titles: it builds per-leaf bipartite graphs and maps permutations of title tokens onto a curated keyphrase universe. Inference is low-latency and needs no GPU. The graphs are stored as lightweight compressed sparse row (CSR) structures, candidate keyphrases are enumerated through an efficient token-based expansion, and candidates are ranked by a Label Title Alignment score that favors concise, highly relevant phrases. Because recommendations are decoupled from ground-truth clicks, the approach mitigates click-data biases; it focuses on head keyphrases under budget constraints and achieves higher diversity and broader advertiser impact in production at eBay. Empirical results, supported by a bespoke AI-based evaluation framework, show significant incremental gains in revenue and gross merchandise value (GMV), along with scalable daily model refreshes and favorable latency across billions of items.
Abstract
E-commerce platforms recommend keyphrases to online sellers and advertisers for their listed products, and sellers bid on these keyphrases to increase sales. One popular paradigm for generating such recommendations is Extreme Multi-Label Classification (XMC), which tags or maps keyphrases to items. We outline the limitations of traditional item-query tagging and mapping techniques for keyphrase recommendation on e-commerce platforms. We introduce GraphEx, a graph-based approach that recommends keyphrases to sellers by extracting token permutations from item titles. We also demonstrate that relying on traditional metrics such as precision and recall can be misleading in practical applications, so a combination of metrics is needed to evaluate performance in real-world scenarios. These metrics are designed to assess the relevance of keyphrases to items and the potential for buyer outreach. GraphEx outperforms production models at eBay on both of these objectives. It supports near real-time inference in resource-constrained production environments and scales effectively to billions of items.
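To make the extraction idea concrete, the following is a minimal Python sketch of the pipeline as summarized above, for a single leaf: a token-to-keyphrase bipartite graph stored in CSR form, candidate enumeration by expanding each title token, and ranking by an alignment score. The function names (`build_csr`, `recommend`), the whitespace tokenizer, and the scoring formula are assumptions made for illustration. In particular, the score used here is a stand-in that rewards matched tokens and penalizes keyphrase tokens absent from the title; it is not claimed to be the paper's Label Title Alignment definition.

```python
from collections import defaultdict


def build_csr(keyphrases):
    """Build a token -> keyphrase bipartite graph in CSR form.

    Hypothetical layout: `offsets[row]:offsets[row + 1]` slices `neighbors`
    to give the keyphrase ids adjacent to the token stored at `row`.
    Keyphrases are assumed to be lowercase, whitespace-separated strings.
    """
    adjacency = defaultdict(list)
    for phrase_id, phrase in enumerate(keyphrases):
        for token in set(phrase.split()):
            adjacency[token].append(phrase_id)

    token_index = {}
    offsets = [0]
    neighbors = []
    for row, token in enumerate(sorted(adjacency)):
        token_index[token] = row
        neighbors.extend(adjacency[token])
        offsets.append(len(neighbors))
    return token_index, offsets, neighbors


def recommend(title, keyphrases, graph, top_k=3):
    """Enumerate keyphrases sharing tokens with the title and rank them."""
    token_index, offsets, neighbors = graph

    # Token-based expansion: count, per keyphrase, how many distinct
    # title tokens it contains.
    matched = defaultdict(int)
    for token in set(title.lower().split()):
        row = token_index.get(token)
        if row is None:
            continue  # token does not occur in any curated keyphrase
        for phrase_id in neighbors[offsets[row]:offsets[row + 1]]:
            matched[phrase_id] += 1

    # Stand-in alignment score (an assumption, not the paper's formula):
    # more matched tokens raise it, unmatched keyphrase tokens lower it.
    scored = []
    for phrase_id, count in matched.items():
        length = len(set(keyphrases[phrase_id].split()))
        score = count / (length - count + 1)
        scored.append((score, keyphrases[phrase_id]))

    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [phrase for _, phrase in scored[:top_k]]
```

Calling `build_csr` once per refresh and `recommend` per item mirrors the split between offline graph construction and lightweight online inference; both steps here use only plain lists and dictionaries, so no GPU or numerical library is required.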
