Warming Up Cold-Start CTR Prediction by Learning Item-Specific Feature Interactions

Yaqing Wang, Hongming Piao, Daxiang Dong, Quanming Yao, Jingbo Zhou

TL;DR

EmerG tackles cold-start CTR prediction by shifting from global to item-specific feature interactions. It uses hypernetworks to generate an item-specific feature graph, processes that graph with a customized GNN whose message passing explicitly captures feature interactions of any order, and trains both with a meta-learning framework that adapts only a small set of item-specific parameters per task to reduce the risk of overfitting. Experiments on MovieLens and Taobao demonstrate state-of-the-art performance in the cold-start, warm-up, and data-rich regimes, and visualizations of the generated graphs show interpretable, item-tailored interaction patterns. The approach is practical for CTR systems with rapidly changing item catalogs and varying data availability, and may apply to other domains where items accumulate history incrementally.

Abstract

In recommendation systems, new items are continuously introduced; they initially lack interaction records and accumulate them only gradually. Accurately predicting the click-through rate (CTR) for these items is crucial for both revenue and user experience. Existing methods focus on improving the item ID embeddings of new items within general CTR models, but they adopt a global feature interaction approach, so new items with sparse data are often overshadowed by items with abundant interactions. To address this, our work introduces EmerG, a novel approach that warms up cold-start CTR prediction by learning item-specific feature interaction patterns. EmerG uses hypernetworks to generate an item-specific feature graph from item characteristics, which is then processed by a graph neural network (GNN). This GNN is tailored to provably capture feature interactions of any order through a customized message passing mechanism. We further design a meta-learning strategy that optimizes the parameters of the hypernetworks and the GNN across various item CTR prediction tasks, while adjusting only a minimal set of item-specific parameters within each task. This strategy reduces the risk of overfitting when data is limited. Extensive experiments on benchmark datasets show that EmerG consistently performs best when new items have no, few, or sufficient instances.
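
The pipeline described in the abstract can be illustrated with a toy forward pass. This is a minimal sketch under stated assumptions, not the authors' implementation: the hypernetwork is reduced to one linear map per edge followed by a sigmoid, the message passing is assumed to multiply each node's initial embedding by its aggregated neighbours, and all names (`generate_adjacency`, `gnn_layer`, `predict_ctr`, `hyper_weights`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def generate_adjacency(item_emb, hyper_weights):
    """Hypothetical hypernetwork: one linear map per edge (m, n) that turns
    the item embedding into an edge weight in (0, 1)."""
    return [[sigmoid(sum(w * x for w, x in zip(edge_w, item_emb)))
             for edge_w in row]
            for row in hyper_weights]


def gnn_layer(adj, h_prev, h_init):
    """Assumed multiplicative message passing: aggregate neighbours with the
    item-specific adjacency, then multiply element-wise by the node's own
    initial embedding."""
    num_nodes, dim = len(h_init), len(h_init[0])
    out = []
    for m in range(num_nodes):
        agg = [sum(adj[m][n] * h_prev[n][k] for n in range(num_nodes))
               for k in range(dim)]
        out.append([h_init[m][k] * agg[k] for k in range(dim)])
    return out


def predict_ctr(adj, h_init, num_layers=2):
    """Sum the node embeddings of every layer and squash to a probability."""
    h, logit = h_init, sum(map(sum, h_init))
    for _ in range(num_layers):
        h = gnn_layer(adj, h, h_init)
        logit += sum(map(sum, h))
    return sigmoid(logit)


rng = random.Random(0)
NUM_FEATURES, EMB_DIM, ITEM_DIM = 4, 3, 3
# Shared (global) hypernetwork weights: one weight vector per possible edge.
hyper_weights = [[[rng.uniform(-1, 1) for _ in range(ITEM_DIM)]
                  for _ in range(NUM_FEATURES)]
                 for _ in range(NUM_FEATURES)]
# One embedding per feature field; kept fixed here so that only the
# item-specific graph differs between the two predictions.
features = [[rng.uniform(-0.5, 0.5) for _ in range(EMB_DIM)]
            for _ in range(NUM_FEATURES)]

adj_a = generate_adjacency([1.0, 0.0, 0.5], hyper_weights)   # toy item A
adj_b = generate_adjacency([0.0, 1.0, -0.5], hyper_weights)  # toy item B
p_a = predict_ctr(adj_a, features)
p_b = predict_ctr(adj_b, features)
```

The point of the sketch is that the hypernetwork weights are shared across items while the resulting graph is not: two items with different characteristics get different adjacency matrices, and therefore different interaction patterns, from the same global parameters.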

Paper Structure

This paper contains 34 sections, 1 theorem, 19 equations, 5 figures, 5 tables, 2 algorithms.

Key Result

Proposition 4.1

With the customized message passing process defined in the GNN update equation (eq:gnn-update), the $(l-1)$th GNN layer captures $l$-order feature interactions.
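
The layer-to-order correspondence can be checked numerically on a toy graph. The sketch below is an illustration, not the paper's proof: it assumes a simplified update in which each node multiplies its initial (scalar) embedding by the adjacency-weighted sum of the previous layer, and the names `scalar_gnn_layer` and `layer_outputs` are hypothetical. If layer $l-1$ contains only $l$-order terms, scaling every initial embedding by $t$ must scale that layer's output by $t^{l}$.

```python
import math


def scalar_gnn_layer(adj, h_prev, h_init):
    """Assumed update: h_m <- h_init[m] * sum_n adj[m][n] * h_prev[n]."""
    n = len(h_init)
    return [h_init[m] * sum(adj[m][j] * h_prev[j] for j in range(n))
            for m in range(n)]


def layer_outputs(adj, h_init, num_layers):
    """Return [h^(0), h^(1), ..., h^(num_layers)]."""
    outs, h = [list(h_init)], list(h_init)
    for _ in range(num_layers):
        h = scalar_gnn_layer(adj, h, h_init)
        outs.append(h)
    return outs


adj = [[0.0, 1.0, 1.0],
       [1.0, 0.0, 1.0],
       [1.0, 1.0, 0.0]]
h_init = [0.5, -1.0, 2.0]
t = 2.0

base = layer_outputs(adj, h_init, 3)
scaled = layer_outputs(adj, [t * x for x in h_init], 3)

# Layer index k (0-based) should be homogeneous of degree k + 1,
# i.e. the (l-1)th layer holds l-order terms.
degrees_match = all(
    math.isclose(scaled[k][m], (t ** (k + 1)) * base[k][m])
    for k in range(len(base))
    for m in range(len(h_init))
)
```

Under this assumed update, `degrees_match` being true means every term at layer index $k$ is a product of exactly $k+1$ initial embeddings, which matches the indexing in the proposition.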

Figures (5)

  • Figure 1: Illustration of the proposed EmerG, designed to enhance CTR predictions of newly emerging items through the learning of item-specific feature interaction patterns. EmerG uses hypernetworks to generate an initial item-specific adjacency matrix for a feature graph, with nodes representing user and item features and edges denoting their interactions, based on item feature embeddings. Higher-order adjacency matrices for subsequent GNN layers are generated from the initial matrix, reducing both model complexity and storage requirements. The GNN's message passing process is tailored to capture $l$-order feature interactions at the $(l-1)$th layer, enabling nuanced integration of various interaction orders for accurate predictions. EmerG optimizes the parameters of hypernetworks and GNN across diverse CTR prediction tasks to enhance generalization, while utilizing minimal item-specific parameters to capture the uniqueness of new items, which are adaptable with the introduction of additional item instances.
  • Figure 2: Comparing EmerG with DeepFM and CVAR given sufficient training samples.
  • Figure 3: Ablation study on MovieLens and Taobao.
  • Figure 4: Varying the number of GNN layers in EmerG.
  • Figure 5: Visualizations of item-specific adjacency matrices of movie Lawnmower Man 2: Beyond Cyberspace ($\mathbf{A}^{(i)}_{\text{W}}$) and movie Waiting to Exhale ($\mathbf{A}^{(i)}_{\text{L}}$) in MovieLens generated by EmerG.
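
The Figure 1 caption notes that higher-order adjacency matrices are derived from the initial one instead of being generated separately. The caption does not say how, so the following sketch assumes one plausible rule, repeated multiplication by the initial matrix followed by row normalization; `layer_adjacencies` and the helper names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def row_normalize(a):
    """Scale each row to sum to 1 (rows that sum to 0 are left unchanged)."""
    out = []
    for row in a:
        total = sum(row)
        out.append([x / total for x in row] if total > 0 else list(row))
    return out


def layer_adjacencies(a_init, num_layers):
    """Assumed rule: A^(l) = normalize(A^(l-1) @ A^(0)).
    Only a_init has to be generated and stored per item."""
    mats = [row_normalize(a_init)]
    for _ in range(num_layers - 1):
        mats.append(row_normalize(matmul(mats[-1], mats[0])))
    return mats


# Toy initial item-specific adjacency over three feature nodes, no self-loops.
a_init = [[0.0, 0.9, 0.1],
          [0.2, 0.0, 0.8],
          [0.5, 0.5, 0.0]]
adjs = layer_adjacencies(a_init, 3)
```

With this rule, the per-item cost is a single generated matrix regardless of the number of GNN layers, which is one way to read the caption's claim about reduced model complexity and storage.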

Theorems & Definitions (2)

  • Proposition 4.1: Efficacy of EmerG.
  • Proof of Proposition 4.1.