On Predicting Post-Click Conversion Rate via Counterfactual Inference

Junhyung Ahn, Sanghack Lee

TL;DR

This work tackles the challenge of predicting post-click CVR under sample selection bias and data sparsity by introducing ESCIM, a framework that generates counterfactual conversion labels for non-clicked samples via a structural causal model. ESCIM performs counterfactual inference using Abduction-Action-Prediction, leveraging a VAE to learn the posterior over latent exogenous factors and a pre-trained CVR predictor to estimate counterfactual CVRs for the non-clicked space; the predicted CVRs are transformed into hard labels through max or ratio strategies and integrated into a multi-task CVR objective. Empirical results on Ali-CCP and Ali-Express show consistent offline gains over state-of-the-art baselines, while online A/B tests demonstrate substantial improvements in CVR, CTCVR, and CPA, and analyses on latent conversion data confirm improved generalization to unseen user behavior. The approach offers robust handling of MNAR issues in CVR prediction and provides a principled pathway to utilize the full exposure space, with practical impact for better-targeted recommendations and advertising efficiency.

Abstract

Accurately predicting conversion rate (CVR) is essential in various recommendation domains such as online advertising systems and e-commerce. These systems utilize user interaction logs, which consist of exposures, clicks, and conversions. CVR prediction models are typically trained solely based on clicked samples, as conversions can only be determined following clicks. However, the sparsity of clicked instances necessitates the collection of a substantial amount of logs for effective model training. Recent works address this issue by devising frameworks that leverage non-clicked samples. While these frameworks aim to reduce biases caused by the discrepancy between clicked and non-clicked samples, they often rely on heuristics. Against this background, we propose a method to counterfactually generate conversion labels for non-clicked samples by using causality as a guiding principle, attempting to answer the question, "Would the user have converted if he or she had clicked the recommended item?" Our approach is named the Entire Space Counterfactual Inference Multi-task Model (ESCIM). We initially train a structural causal model (SCM) of user sequential behaviors and conduct a hypothetical intervention (i.e., click) on non-clicked items to infer counterfactual CVRs. We then introduce several approaches to transform predicted counterfactual CVRs into binary counterfactual conversion labels for the non-clicked samples. Finally, the generated samples are incorporated into the training process. Extensive experiments on public datasets illustrate the superiority of the proposed algorithm. Online A/B testing further empirically validates the effectiveness of our proposed algorithm in real-world scenarios. In addition, we demonstrate the improved performance of the proposed method on latent conversion data, showcasing its robustness and superior generalization capabilities.
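The step that turns predicted counterfactual CVRs into binary labels can be illustrated with a minimal sketch. The two rules below (a fixed cutoff, and labeling a fixed top fraction as positive) are illustrative assumptions and are not necessarily the transformation strategies used in the paper; the function names, parameters, and sample values are hypothetical.

```python
def labels_by_threshold(cf_cvr, threshold):
    """Label a non-clicked sample positive when its counterfactual CVR
    reaches the cutoff (hypothetical rule, for illustration only)."""
    return [1 if p >= threshold else 0 for p in cf_cvr]


def labels_by_ratio(cf_cvr, positive_ratio):
    """Label the top `positive_ratio` fraction of samples, ranked by
    counterfactual CVR, as positive (hypothetical rule)."""
    n_pos = int(round(positive_ratio * len(cf_cvr)))
    # Indices sorted from highest to lowest predicted counterfactual CVR.
    order = sorted(range(len(cf_cvr)), key=lambda i: cf_cvr[i], reverse=True)
    positives = set(order[:n_pos])
    return [1 if i in positives else 0 for i in range(len(cf_cvr))]


# Made-up counterfactual CVRs for five non-clicked samples.
cf_cvr = [0.02, 0.40, 0.10, 0.75, 0.05]
threshold_labels = labels_by_threshold(cf_cvr, threshold=0.3)
ratio_labels = labels_by_ratio(cf_cvr, positive_ratio=0.2)
```

A cutoff rule lets the number of generated positives vary with the score distribution, whereas a ratio rule fixes the share of positives in advance; which behaves better is an empirical question.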

Paper Structure

This paper contains 40 sections, 12 equations, 5 figures, 6 tables, 3 algorithms.

Figures (5)

  • Figure 1: An example of the data sparsity and selection bias of the CVR estimation task, where the training space $\mathcal{C}$ only contains clicked samples, while the inference space $\mathcal{D}$ consists of all exposed samples.
  • Figure 2: Overall procedure of ESCIM. (a) $Z$ represents the exogenous variable for the conversion event, acting as an underlying factor of user conversion behavior. (b) The hypothetical intervention on $C$ simulates a scenario where the click event is set to 1 to generate a counterfactual conversion outcome. (c) Counterfactual labels for non-clicked samples are generated through counterfactual inference and label transformation (bottom). The generated labels are then utilized to train a CVR prediction model (top).
  • Figure 3: The CVR and CTCVR AUC for different thresholds on the AE-FR dataset.
  • Figure 4: Distributions of pCVR and pCTCVR on latent conversion data in the Ali-CCP dataset.
  • Figure 5: The CVR and CTCVR AUC for different values of $\alpha_{CF}$ on the Ali-CCP dataset.
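The three counterfactual steps summarized in the Figure 2 caption (infer $Z$, intervene on $C$, predict the conversion outcome) can be sketched on a toy model. Everything here is an assumption made for illustration: the linear "encoder" stands in for the trained VAE posterior, the logistic structural equation stands in for the pre-trained CVR predictor, and all weights, names, and the fixed posterior standard deviation are hypothetical.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def abduct(features, enc_weights):
    """Abduction: return (mean, std) of a Gaussian posterior over the
    exogenous variable z. Hypothetical stand-in for a trained encoder."""
    mean = sum(w * x for w, x in zip(enc_weights, features))
    return mean, 0.1  # fixed std, assumed for the sketch


def conversion_prob(features, click, z, cvr_weights):
    """Toy structural equation for conversion: no click, no conversion."""
    if click == 0:
        return 0.0
    score = sum(w * x for w, x in zip(cvr_weights, features)) + z
    return sigmoid(score)


def counterfactual_cvr(features, enc_weights, cvr_weights, n_samples=200, seed=0):
    rng = random.Random(seed)
    mean, std = abduct(features, enc_weights)    # 1. abduction
    click = 1                                    # 2. action: do(C = 1)
    draws = [rng.gauss(mean, std) for _ in range(n_samples)]
    # 3. prediction: average the outcome over posterior draws of z.
    probs = [conversion_prob(features, click, z, cvr_weights) for z in draws]
    return sum(probs) / n_samples


x = [1.0, 0.5]  # features of one non-clicked sample (made up)
cf = counterfactual_cvr(x, enc_weights=[0.2, -0.1], cvr_weights=[-1.0, 0.4])
factual = conversion_prob(x, click=0, z=0.0, cvr_weights=[-1.0, 0.4])
```

In this toy setting the factual outcome for a non-clicked sample is always zero, while the counterfactual CVR under the intervention depends on both the features and the inferred $Z$.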