A Novel Generative Model with Causality Constraint for Mitigating Biases in Recommender Systems
Jianfeng Deng, Qingfeng Chen, Debo Cheng, Jiuyong Li, Lin Liu, Shichao Zhang
TL;DR
This paper tackles latent confounding bias in recommender systems by introducing LCDR, a generative framework that uses an identifiable VAE (iVAE) to learn causally informative latent representations from proxy signals. LCDR constrains a latent-causality-aware VAE (LCVAE) so that its latent space $Z_{lc}$ aligns with the iVAE-derived $Z$, by adding an $\ell_2$-based penalty $\lambda\|Z_{lc}-Z\|_2$ to the ELBO objective. This alignment lets the model recover latent confounders even from noisy proxies. The method combines a matrix factorisation backbone with the causally constrained latent representation to improve both bias mitigation and recommendation accuracy. On Coat, Yahoo!R3, and KuaiRand, LCDR consistently outperforms state-of-the-art baselines and its own ablations. The work also provides identifiability results for the learned representations under realistic conditions and demonstrates robustness to proxy quality, which suggests it is applicable to real-world recommender systems facing latent confounding and sparse, biased data.
Abstract
Accurately predicting counterfactual user feedback is essential for building effective recommender systems. However, latent confounding bias can obscure the true causal relationship between user feedback and item exposure, ultimately degrading recommendation performance. Existing causal debiasing approaches often rely on strong assumptions, such as the availability of instrumental variables (IVs) or strong correlations between latent confounders and proxy variables, that are rarely satisfied in real-world scenarios. To address these limitations, we propose a novel generative framework called Latent Causality Constraints for Debiasing representation learning in Recommender Systems (LCDR). Specifically, LCDR uses an identifiable Variational Autoencoder (iVAE) as a causal constraint to align the latent representations learned by a standard Variational Autoencoder (VAE) through a unified loss function. This alignment allows the model to recover latent confounders effectively even from weak or noisy proxy variables. The resulting representations are then used to improve recommendation performance. Extensive experiments on three real-world datasets demonstrate that LCDR consistently outperforms existing methods in both mitigating bias and improving recommendation accuracy.
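To make the idea of a unified loss concrete, the following is a minimal NumPy sketch of a negative ELBO augmented with the alignment penalty $\lambda\|Z_{lc}-Z\|_2$ mentioned in the TL;DR. It is an illustration only, not the authors' implementation: the function names, the squared-error reconstruction term, the standard-normal prior, and the batch averaging are all assumptions made for the example.

```python
import numpy as np


def gaussian_kl(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)) per row, summed over latent dims.

    A standard-normal prior is an assumption of this sketch.
    """
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=1)


def constrained_neg_elbo(x, x_recon, mu_lc, logvar_lc, z_lc, z_ivae, lam=1.0):
    """Hypothetical unified loss: reconstruction + KL + lam * ||z_lc - z_ivae||_2.

    x, x_recon        : (batch, n_items) observed and reconstructed feedback
    mu_lc, logvar_lc  : (batch, d) parameters of the constrained VAE posterior
    z_lc, z_ivae      : (batch, d) latents from the constrained VAE and the iVAE
    lam               : weight of the alignment penalty (lambda in the text)
    """
    # Squared-error reconstruction term (assumed likelihood, per row).
    recon = np.sum((x - x_recon) ** 2, axis=1)
    # KL regulariser of the constrained VAE posterior.
    kl = gaussian_kl(mu_lc, logvar_lc)
    # l2 distance between the two latent representations, per row.
    align = np.linalg.norm(z_lc - z_ivae, axis=1)
    # Average the per-row loss over the batch.
    return float(np.mean(recon + kl + lam * align))
```

With perfect reconstruction, a posterior equal to the prior, and identical latents, every term vanishes and the loss is zero; moving `z_lc` away from `z_ivae` increases the loss linearly in the distance, scaled by `lam`.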
