Debiased Collaborative Filtering with Kernel-Based Causal Balancing

Haoxuan Li, Chunyuan Zheng, Yanghao Xiao, Peng Wu, Zhi Geng, Xu Chen, Peng Cui

TL;DR

This work tackles selection bias in collaborative filtering by reweighting observations through propensity scores and identifies gaps in existing strategies that fail to satisfy causal balancing constraints. It introduces kernel-based causal balancing in reproducing kernel Hilbert spaces (RKHS) and an adaptive kernel balancing (AKB) method to automatically emphasize balancing functions that matter most for reducing bias, backed by generalization guarantees. The authors define KBIPS and KBDR estimators, provide worst-case and adaptive balancing mechanisms, and establish finite-sample bounds within RKHS, demonstrating reduced bias and improved debiasing performance. Empirical results on Coat, Music, and Product show AKB-based methods consistently outperform strong baselines, validating the practicality and effectiveness of kernel-based balancing for debiased collaborative filtering, with code available at the project repository.

Abstract

Debiased collaborative filtering aims to learn an unbiased prediction model by removing different biases in observational datasets. To solve this problem, one of the simple and effective methods is based on the propensity score, which adjusts the observational sample distribution to the target one by reweighting observed instances. Ideally, propensity scores should be learned with causal balancing constraints. However, existing methods usually ignore such constraints or implement them with unreasonable approximations, which may affect the accuracy of the learned propensity scores. To bridge this gap, in this paper, we first analyze the gaps between the causal balancing requirements and existing methods such as learning the propensity with cross-entropy loss or manually selecting functions to balance. Inspired by these gaps, we propose to approximate the balancing functions in reproducing kernel Hilbert space and demonstrate that, based on the universal property and representer theorem of kernel functions, the causal balancing constraints can be better satisfied. Meanwhile, we propose an algorithm that adaptively balances the kernel function and theoretically analyze the generalization error bound of our methods. We conduct extensive experiments to demonstrate the effectiveness of our methods, and to promote this research direction, we have released our project at https://github.com/haoxuanli-pku/ICLR24-Kernel-Balancing.
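The reweighting idea in the abstract can be illustrated with a minimal sketch of inverse propensity scoring (IPS) on synthetic data. This is not the authors' code: the function names, the synthetic errors, and the assumption that the true propensities are known are all hypothetical choices made for illustration. In practice the propensities must be learned, which is the step the paper targets.

```python
import numpy as np


def ips_estimate(errors, observed, propensities):
    """IPS estimate of the average error over ALL user-item pairs.

    Each observed error is weighted by 1 / propensity; unobserved pairs
    contribute zero. Propensities are assumed known here (illustrative only).
    """
    errors = np.asarray(errors, dtype=float)
    observed = np.asarray(observed, dtype=float)
    propensities = np.asarray(propensities, dtype=float)
    return float(np.mean(observed * errors / propensities))


def naive_estimate(errors, observed):
    """Plain average of the errors over the observed pairs only."""
    errors = np.asarray(errors, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    return float(errors[observed].mean())


rng = np.random.default_rng(0)
n = 200_000

# Hypothetical per-pair prediction errors in [0, 1].
errors = rng.uniform(0.0, 1.0, size=n)

# Selection bias: pairs with larger error are more likely to be observed.
propensities = 0.1 + 0.8 * errors
observed = rng.uniform(size=n) < propensities

ideal = float(errors.mean())              # average over all pairs (the target)
naive = naive_estimate(errors, observed)  # biased toward high-error pairs
ips = ips_estimate(errors, observed, propensities)

print(f"ideal={ideal:.3f}  naive={naive:.3f}  ips={ips:.3f}")
```

With the true propensities, the IPS value lands close to the ideal average, while the naive average over observed pairs is noticeably too high. When propensities are misestimated, this correction degrades, which is the motivation for constraining how they are learned.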

Paper Structure

This paper contains 16 sections, 4 theorems, 67 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

If $e_{u, i} \in \mathcal{H}_J=\operatorname{span}\{h^{(1)}(\cdot), \ldots, h^{(J)}(\cdot)\}$, then the propensities learned above yield an unbiased estimate of the ideal loss under the IPS method.
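
The excerpt does not reproduce the constraints that the theorem refers to, so the following is only a sketch of why a statement of this form can hold, under two assumptions: that $e_{u,i}$ denotes the prediction error on the pair $(u,i)$, and that the learned propensities $\hat{p}_{u,i}$ satisfy an exact balancing condition for each of the $J$ functions. The symbols $\mathcal{D}$ (all user-item pairs), $o_{u,i}$ (observation indicator), $x_{u,i}$ (features), $\hat{p}_{u,i}$, and $\alpha_j$ are notation introduced here for illustration. Suppose that, for every $j = 1, \ldots, J$,

$$
\frac{1}{|\mathcal{D}|} \sum_{(u,i) \in \mathcal{D}} \frac{o_{u,i}\, h^{(j)}(x_{u,i})}{\hat{p}_{u,i}}
= \frac{1}{|\mathcal{D}|} \sum_{(u,i) \in \mathcal{D}} h^{(j)}(x_{u,i}).
$$

If $e_{u,i} = \sum_{j=1}^{J} \alpha_j h^{(j)}(x_{u,i})$, then by linearity

$$
\frac{1}{|\mathcal{D}|} \sum_{(u,i) \in \mathcal{D}} \frac{o_{u,i}\, e_{u,i}}{\hat{p}_{u,i}}
= \sum_{j=1}^{J} \alpha_j \cdot \frac{1}{|\mathcal{D}|} \sum_{(u,i) \in \mathcal{D}} \frac{o_{u,i}\, h^{(j)}(x_{u,i})}{\hat{p}_{u,i}}
= \sum_{j=1}^{J} \alpha_j \cdot \frac{1}{|\mathcal{D}|} \sum_{(u,i) \in \mathcal{D}} h^{(j)}(x_{u,i})
= \frac{1}{|\mathcal{D}|} \sum_{(u,i) \in \mathcal{D}} e_{u,i}.
$$

Under these assumptions the IPS estimate coincides with the ideal loss. The argument also shows where it can fail: if the error is not in the span of the chosen functions, the balancing condition gives no such guarantee.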

Figures (2)

  • Figure 1: Effects of the value of $J$ on AUC and NDCG@20 on the Product dataset.
  • Figure 2: Effects of hyper-parameter $\gamma$ on AUC and NDCG@$K$ on Music and Product datasets.

Theorems & Definitions (15)

  • Theorem 1
  • Definition 1: Kernel function
  • Definition 2: Universal kernel
  • Lemma 1: Sriperumbudur et al. (2011)
  • Lemma 2: Representer theorem
  • Theorem 2: Generalization Bounds in RKHS
  • proof
  • proof
  • proof
  • proof
  • ...and 5 more