CounterCLR: Counterfactual Contrastive Learning with Non-random Missing Data in Recommendation

Jun Wang, Haoxuan Li, Chi Zhang, Dongxu Liang, Enyun Yu, Wenwu Ou, Wenjia Wang

TL;DR

CounterCLR tackles non-random missing data in recommendations by marrying a causality-driven predictor (CauNet) with a self-supervised contrastive objective. It operates under the Potential Outcome Framework to model exposure vs. non-exposure ratings and to mitigate selection bias and data sparsity without extra propensity estimators or unbiased MAR data. The three-headed CauNet predicts $\hat{r}_{u,i}^{1}$ and $\hat{r}_{u,i}^{0}$ and estimates $\hat{o}_{u,i}$, while a momentum mechanism aligns the exposure and non-exposure representations; a causally informed contrastive loss further enforces similarity of user preference embeddings across exposure statuses. Experiments on Coat, Yahoo! R3, and KuaiRec demonstrate that CounterCLR outperforms state-of-the-art debiasing methods, especially in settings with limited or no unbiased data, underscoring its practical robustness and scalability for real-world recommender systems.
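
To make the contrastive idea concrete, here is a minimal sketch of a generic InfoNCE-style loss. It is not the paper's exact objective, whose form is not given in this summary. It assumes each user has one embedding computed under exposure and one under non-exposure, treats the two embeddings of the same user as a positive pair, and uses other users' embeddings as negatives. The names `exposed`, `unexposed`, and `temperature` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(exposed, unexposed, temperature=0.5):
    """Average InfoNCE loss over users (illustrative assumption only).

    exposed[k] and unexposed[k] are two embeddings of the same user k;
    the embeddings of all other users serve as negatives.
    """
    total = 0.0
    for k, anchor in enumerate(exposed):
        logits = [cosine(anchor, other) / temperature for other in unexposed]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(v - peak) for v in logits))
        # Negative log-probability of picking the matching user.
        total += log_denominator - logits[k]
    return total / len(exposed)


# Toy data: two users with orthogonal preference embeddings.
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(aligned, aligned)  # same user matches across statuses
loss_swapped = info_nce(aligned, swapped)  # each user matches the wrong one
```

In this toy setup the loss is lower when a user's two embeddings agree than when they are swapped with another user's, which is the behavior a contrastive objective of this kind rewards.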

Abstract

Recommender systems are designed to learn user preferences from observed feedback and comprise many fundamental tasks, such as rating prediction and post-click conversion rate (pCVR) prediction. However, the observed feedback usually suffers from two issues, selection bias and data sparsity: biased and insufficient feedback seriously degrades the accuracy and ranking performance of recommender systems. Existing solutions to these issues, such as data imputation and inverse propensity scoring, depend heavily on additionally trained imputation or propensity models. In this work, we propose a novel counterfactual contrastive learning framework for recommendation, named CounterCLR, to tackle the problem of non-random missing data by exploiting advances in contrastive learning. Specifically, CounterCLR employs a deep representation network, called CauNet, to infer non-random missing data in recommendations, and performs user preference modeling by further introducing a self-supervised contrastive learning task. CounterCLR mitigates the selection bias problem without the need for additional models or estimators, while also enhancing generalization when data are sparse. Experiments on real-world datasets demonstrate the effectiveness and superiority of our method.

Paper Structure (14 sections, 5 equations, 2 figures, 1 table)

This paper contains 14 sections, 5 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: The architecture of CounterCLR, including a causality-based prediction model with an auxiliary contrastive learning objective.
  • Figure 2: Rating prediction accuracy and recommendation quality under varying observed ratios of the small matrix in the KuaiRec dataset.

Theorems & Definitions (1)

  • Example 1: Rating Prediction Task under POF