Addressing Cold-start Problem in Click-Through Rate Prediction via Supervised Diffusion Modeling

Wenqiao Zhu, Lulu Wang, Jun Wu

TL;DR

The paper tackles the item cold-start problem in CTR prediction by introducing CSDM, a supervised diffusion model that learns a transition between pre-existing item ID embeddings and side information features. The approach uses a non-Markovian forward diffusion to fuse embeddings with side information and a learned reverse process to generate warmed-up embeddings, optimized with a combined CTR loss and diffusion objective $L = L_{\mathrm{ctr}} + \rho L_{\mathrm{diff}}$, while enabling sub-sequence acceleration for training. Experiments on three public CTR datasets show that CSDM outperforms state-of-the-art cold-start baselines and is generalizable across backbone models, with the practical advantage of no inference-time overhead since warmed embeddings are written back to the original ID embedding space. Overall, the work provides a diffusion-based mechanism to reduce cold-start performance gaps in CTR, offering a model-agnostic and practically efficient solution for industrial recommendation systems.
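As a rough illustration of how the combined objective weights its two terms, the following minimal sketch computes a binary cross-entropy CTR loss and adds a diffusion term scaled by rho. The function names `bce_loss` and `combined_loss` are hypothetical, and the diffusion loss is passed in as an opaque scalar because this summary does not give its exact form.

```python
import math


def bce_loss(probs, labels, eps=1e-12):
    """Mean binary cross-entropy over predicted click probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0) on saturated predictions.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def combined_loss(probs, labels, diff_loss, rho):
    """L = L_ctr + rho * L_diff.

    diff_loss is assumed to be computed elsewhere (its form is not
    specified here); rho is the weighting hyperparameter.
    """
    return bce_loss(probs, labels) + rho * diff_loss
```

Setting `rho` to zero reduces the objective to the plain CTR loss, which is one way to see what the diffusion term contributes.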

Abstract

Predicting Click-Through Rates is a crucial function within recommendation and advertising platforms, as the output of CTR prediction determines the order of items shown to users. The Embedding & MLP paradigm has become a standard approach for industrial recommendation systems and has been widely deployed. However, this paradigm suffers from cold-start problems, where there is either no or only limited user action data available, leading to poorly learned ID embeddings. The cold-start problem hampers the performance of new items. To address this problem, we designed a novel diffusion model to generate a warmed-up embedding for new items. Specifically, we define a novel diffusion process between the ID embedding space and the side information space. In addition, we can derive a sub-sequence from the diffusion steps to expedite training, given that our diffusion model is non-Markovian. Our diffusion model is supervised by both the variational inference and binary cross-entropy objectives, enabling it to generate warmed-up embeddings for items in both the cold-start and warm-up phases. Additionally, we have conducted extensive experiments on three recommendation datasets. The results confirmed the effectiveness of our approach.
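The sub-sequence idea mentioned in the abstract can be sketched as picking a short, increasing list of step indices out of the full diffusion schedule. The helper below is a hypothetical illustration only: `T` (total number of steps) and `s` (sub-sequence length) are assumed names, and the even spacing is an assumption rather than the selection rule used in the paper.

```python
def step_subsequence(T, s):
    """Return s strictly increasing step indices in 1..T, ending at T.

    Even spacing is an assumption made for illustration; any increasing
    sub-sequence of the steps would fit the same pattern.
    """
    if not 1 <= s <= T:
        raise ValueError("s must satisfy 1 <= s <= T")
    # Integer arithmetic keeps the indices exact and strictly increasing.
    return [(T * (i + 1)) // s for i in range(s)]


steps = step_subsequence(100, 5)  # [20, 40, 60, 80, 100]
```

Training would then visit only these indices instead of all `T` steps, which is where the reduction in per-batch cost would come from under this assumption.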

Paper Structure

This paper contains 18 sections, 1 theorem, 28 equations, 6 figures, 4 tables, 1 algorithm.

Key Result

Lemma 1

Given the definitions of $q_\sigma(\mathbf{z}_{1:T} | \mathbf{z}_0, \mathbf{h})$ in Equation (eq:def-dist) and $q_\sigma(\mathbf{z}_{t-1} | \mathbf{z}_t, \mathbf{z}_0, \mathbf{h})$ in Equation (eq:def-post-mean), we have:

Figures (6)

  • Figure 1: The proposed CSDM framework for cold-start problems in CTR prediction.
  • Figure 2: AUC scores evaluated across various stages for different backbone models, conducted over three datasets with 10 runs per model.
  • Figure 3: Performance evaluation on the TaobaoAD dataset using DeepFM as the backbone model across a range of $\rho$ values, mean of three runs.
  • Figure 4: The time cost for training one batch using various methods, with CSDM tested using both $s=5$ and $s=10$.
  • Figure 5: An illustration of the process for generating the warm-up embeddings.
  • ...and 1 more figure

Theorems & Definitions (2)

  • Lemma 1
  • proof