LLMDistill4Ads: Using Cross-Encoders to Distill from LLM Signals for Advertiser Keyphrase Recommendations

Soumik Dey, Benjamin Braun, Naveen Ravipati, Hansi Wu, Binbin Li

TL;DR

This work tackles the misalignment and bias in advertiser keyphrase recommendations by leveraging a multi-task distillation pipeline that moves knowledge from an LLM-based teacher to a cross-encoder assistant and finally to a lightweight bi-encoder student. By combining CTR, Search Relevance, and LLM-generated labels within a multi-dataset training framework, and by employing distillation losses such as Pearson correlation, the approach achieves improved retrieval quality while maintaining practical latency via Matryoshka embeddings. Offline ablations and an in-production A/B test demonstrate significant business impact, including substantial gains in GMB and ROAS, and higher seller adoption of keyphrases. The proposed evaluation protocol integrates de-duplication, relevance filtering, and LLM-based judgment to approximate real-world performance in a two-sided marketplace, offering a scalable path for production-ready advertiser keyphrase retrieval systems.
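The two mechanisms named above, a Pearson-correlation distillation loss and Matryoshka-style embedding truncation, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the loss takes the common form 1 - r between student and teacher scores, and that a Matryoshka embedding is shortened by keeping a prefix of its dimensions and re-normalizing. The function names `pearson_distillation_loss` and `truncate_matryoshka` are hypothetical.

```python
import math


def pearson_distillation_loss(student_scores, teacher_scores):
    """Return 1 - Pearson correlation between student and teacher scores.

    Assumption: the distillation loss is 1 - r, so perfectly correlated
    scores give 0 and perfectly anti-correlated scores give 2.
    """
    n = len(student_scores)
    mean_s = sum(student_scores) / n
    mean_t = sum(teacher_scores) / n
    cov = sum((s - mean_s) * (t - mean_t)
              for s, t in zip(student_scores, teacher_scores))
    var_s = sum((s - mean_s) ** 2 for s in student_scores)
    var_t = sum((t - mean_t) ** 2 for t in teacher_scores)
    denom = math.sqrt(var_s * var_t)
    if denom == 0.0:
        # Correlation is undefined for constant scores; treat as uncorrelated.
        return 1.0
    return 1.0 - cov / denom


def truncate_matryoshka(embedding, dim):
    """Keep the first `dim` coordinates and L2-normalize the result.

    Assumption: the embedding was trained so that its prefixes remain
    useful on their own, which is the premise of Matryoshka embeddings.
    """
    prefix = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in prefix))
    return [x / norm for x in prefix] if norm > 0 else prefix
```

Because Pearson correlation is invariant to shifting and positive scaling of the scores, a loss of this form rewards the student for preserving the teacher's relative ordering and spacing rather than for matching absolute score values. Truncating to a shorter prefix reduces storage and nearest-neighbor search cost at serving time.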

Abstract

E-commerce sellers are advised to bid on keyphrases to boost their advertising campaigns. These keyphrases must be relevant to prevent irrelevant items from cluttering search systems and to maintain positive seller perception. It is vital that keyphrase suggestions align with seller, search and buyer judgments. Given the challenges in collecting negative feedback in these systems, LLMs have been used as a scalable proxy for human judgments. This paper presents an empirical study, conducted on a major e-commerce platform, of a distillation framework involving an LLM teacher, a cross-encoder assistant and a bi-encoder Embedding Based Retrieval (EBR) student model, aimed at mitigating click-induced biases in keyphrase recommendations.

Paper Structure

This paper contains 31 sections, 25 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Auction mechanism of items (Itm) in relation to keyphrases (KP). Red strikethrough font represents the Advertising filter, underlining represents seller curation of keyphrases after Advertising has filtered them, and gray highlighting represents the relevance filter of Search.
  • Figure 2: Our proposed architecture for multi-task knowledge distillation. The LLM is distilled to a cross-encoder, which is in turn distilled to the bi-encoder via multi-task hybrid training.
  • Figure 3: Interface for our human annotators.
  • Figure 4: Production Serving Architecture for keyphrase recommendations.