CETN: Contrast-enhanced Through Network for CTR Prediction

Honghao Li, Lei Sang, Yi Zhang, Xuyun Zhang, Yiwen Zhang

TL;DR

This paper targets CTR prediction, where existing parallel-structure models struggle because their subcomponents receive weak supervisory signals and capture noisy multi-view interactions. It introduces CETN, a simple yet effective framework that combines product-based feature interactions, segmentation of semantic spaces via products and perturbation, a Denominator-only InfoNCE (Do-InfoNCE) loss, and a Through Network to foster both diversity and homogeneity across multiple semantic spaces. The approach uses six parallel Key-Value Blocks and varies the activation functions across semantic spaces to diversify representations, while a fusion layer produces the final CTR estimate, guided by an augmented self-supervised objective and a cosine-based homogeneity constraint. Across four real datasets, CETN consistently outperforms twenty baselines in both AUC and Logloss, which suggests that contrastive learning over multiple semantic spaces is practical for CTR modeling and may apply broadly to industrial recommender systems.

Abstract

Click-through rate (CTR) prediction is a crucial task in personalized information retrieval, such as industrial recommender systems, online advertising, and web search. Most existing CTR prediction models utilize explicit feature interactions to overcome the performance bottleneck of implicit feature interactions. Hence, deep CTR models based on parallel structures (e.g., DCN, FinalMLP, xDeepFM) have been proposed to obtain joint information from different semantic spaces. However, these parallel subcomponents lack effective supervisory signals, making it challenging to efficiently capture valuable multi-view feature interaction information in different semantic spaces. To address this issue, we propose a simple yet effective CTR model, Contrast-enhanced Through Network for CTR (CETN), which ensures both the diversity and the homogeneity of feature interaction information. Specifically, CETN employs product-based feature interactions and the augmentation (perturbation) concept from contrastive learning to segment different semantic spaces, each with distinct activation functions. This improves the diversity of the feature interaction information captured by the model. Additionally, we introduce self-supervised signals and through connections within each semantic space to ensure the homogeneity of the captured feature interaction information. Experiments conducted on four real datasets demonstrate that our model consistently outperforms twenty baseline models in terms of AUC and Logloss.
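The diversity and homogeneity objectives named above can be illustrated with a small sketch. This is not the paper's formulation: it assumes that a "denominator-only" InfoNCE keeps just the log-sum-exp term of the standard InfoNCE loss (dropping the positive-pair numerator), and that the homogeneity constraint is one minus cosine similarity. The function names, the temperature value, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # epsilon avoids division by zero


def denominator_only_infonce(anchor, others, temperature=0.1):
    """Assumed form: log-sum-exp of temperature-scaled similarities.

    Standard InfoNCE is -sim(pos)/t + log(sum(exp(sim/t))); here only the
    second (denominator) term is kept, so minimizing it pushes the anchor
    away from the other representations (diversity).
    """
    scaled = [cosine(anchor, o) / temperature for o in others]
    m = max(scaled)  # subtract the max for numerical stability
    return m + math.log(sum(math.exp(s - m) for s in scaled))


def cosine_homogeneity(u, v):
    """Assumed form: 1 - cosine similarity, which is near 0 when u and v align."""
    return 1.0 - cosine(u, v)


# Toy representations in R^2 (hypothetical values).
anchor = [1.0, 0.0]
loss_diverse = denominator_only_infonce(anchor, [[0.0, 1.0]])  # orthogonal
loss_aligned = denominator_only_infonce(anchor, [[1.0, 0.0]])  # identical
homog_aligned = cosine_homogeneity([1.0, 2.0], [2.0, 4.0])     # same direction
```

Under these assumptions, the diversity term is smaller when representations from different semantic spaces point in different directions, and the homogeneity term is smaller when representations within the same space point in the same direction.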

Paper Structure (40 sections, 26 equations, 13 figures, 8 tables, 1 algorithm)

Figures (13)

  • Figure 1: The architecture of three strong baseline models with parallel structure: DCN, FinalMLP, and xDeepFM.
  • Figure 2: The primary backbone structures of common CTR prediction models.
  • Figure 3: An illustration of the diversity and homogeneity in $\mathbb{R}^{2}$.
  • Figure 4: The architecture of CETN.
  • Figure 5: The architecture of simMHN.
  • ...and 8 more figures