Variational Bayesian Personalized Ranking

Bin Liu, Xiaohong Liu, Qin Luo, Ziqiao Shang, Jielei Chu, Lin Ma, Zhaoyu Li, Fei Teng, Guangtao Zhai, Tianrui Li

TL;DR

VarBPR reframes implicit-feedback pairwise learning as variational inference over discrete latent indices, addressing noise, sparse supervision, and exposure bias. It introduces a two-stage approach with closed-form variational posteriors that yield endogenous exposure control through priors and temperature, while a posterior-compression objective enables linear-time learning. Theoretical results decompose generalization into an indexing error and an opportunity-cost term for exposure patterns, providing a principled tuning guide for controllable recommendations. Empirically, VarBPR improves ranking across backbones, enables controllable long-tail exposure, and retains linear-time efficiency, making it practical to deploy in large-scale recommender systems.

Abstract

Pairwise learning underpins implicit collaborative filtering, yet its effectiveness is often hindered by sparse supervision, noisy interactions, and popularity-driven exposure bias. In this paper, we propose Variational Bayesian Personalized Ranking (VarBPR), a tractable variational framework for implicit-feedback pairwise learning that offers principled exposure controllability and theoretical interpretability. VarBPR reformulates pairwise learning as variational inference over discrete latent indexing variables, explicitly modeling noise and indexing uncertainty, and divides training into two stages: variational inference and variational learning. In the variational inference stage, we develop a variational formulation that integrates preference alignment, denoising, and popularity debiasing under a unified ELBO/regularization objective, deriving closed-form posteriors with clear control semantics: the prior encodes a target exposure pattern, while temperature/regularization strength controls posterior-prior adherence. As a result, exposure controllability becomes an endogenous and interpretable outcome of variational inference. In the variational learning stage, we propose a posterior-compression objective that reduces the ideal ELBO's computational complexity from polynomial to linear, with the approximation justified by an explicit Jensen-gap upper bound. Theoretically, we provide interpretable generalization guarantees by identifying a structural error component and revealing the opportunity cost of prioritizing certain exposure patterns (e.g., long-tail), offering a concrete analytical lens for designing controllable recommender systems. Empirically, we validate VarBPR across popular backbones; it demonstrates consistent gains in ranking accuracy, enables controlled long-tail exposure, and preserves the linear-time complexity of BPR.
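The control semantics described above (a prior that encodes a target exposure pattern, and a temperature that sets how closely the posterior follows it) can be illustrated with the standard closed form of a KL-regularized objective over a discrete index. The sketch below is not taken from the paper: the function name `kl_regularized_posterior`, the toy scores, and the reading of temperature as the weight on the KL term are all assumptions made for illustration. It relies only on the well-known fact that maximizing sum_i q_i * s_i - c * KL(q || p) over the probability simplex gives q_i proportional to p_i * exp(s_i / c).

```python
import math


def kl_regularized_posterior(scores, prior, temperature):
    """Return argmax_q  sum_i q_i * scores_i - temperature * KL(q || prior).

    The maximizer over the simplex is q_i proportional to
    prior_i * exp(scores_i / temperature).
    Hypothetical helper for illustration; every prior entry must be > 0.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if len(scores) != len(prior):
        raise ValueError("scores and prior must have the same length")
    # Work in log space and subtract the max for numerical stability.
    logits = [math.log(p) + s / temperature for p, s in zip(prior, scores)]
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Toy bag of three candidate items (illustrative numbers only).
scores = [2.0, 0.5, -1.0]   # model scores for the candidates
prior = [0.2, 0.3, 0.5]     # target exposure pattern over the candidates

q_loose = kl_regularized_posterior(scores, prior, temperature=0.1)
q_tight = kl_regularized_posterior(scores, prior, temperature=100.0)
```

With a small temperature the posterior concentrates on the highest-scoring candidate; with a large temperature it stays close to the prior. Under the assumptions of this sketch, that is the sense in which the temperature controls posterior-prior adherence.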

Paper Structure

This paper contains 30 sections, 3 theorems, 51 equations, 10 figures, 3 tables, 1 algorithm.

Key Result

Lemma 1

The closed-form optima of Eqs. eq:var-pos and eq:var-neg are

Figures (10)

  • Figure 1: Illustration of VarBPR. Unobservable preferences complicate likelihood estimation. We establish a unified variational framework that integrates preference alignment, denoising, and debiasing within an ELBO, with closed-form variational posteriors. This yields the VarBPR objective for indirect likelihood maximization, featuring analytical inference procedures with explicit exposure control mechanisms.
  • Figure 2: Noise mitigation effect. Left: instance-level contrast. Right: variational attention-based prototype contrast; entropy regularization constrains the impact of noisy samples by keeping their weights below 1.
  • Figure 3: Variational inference enables controllable hard sample mining for both positive and negative samples.
  • Figure 4: Variational posterior-prior alignment controlled by regularization strength $(c_{\mathrm{pos}},c_{\mathrm{neg}})$ and bag size $(M,N)$.
  • Figure 5: The prior exposure pattern determines the empirical exposure distribution under high-compliance settings, with two notable orientations: long-tail oriented (a) and high-quality oriented (b).
  • ...and 5 more figures

Theorems & Definitions (5)

  • Lemma 1: Variational optimum
  • Proof 1
  • Proposition 1: Controlled Jensen gap
  • Theorem 1: Generalization bound
  • Proof 2
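
The Jensen gap named in Proposition 1 can be illustrated on a toy example. The sketch below is an assumption-laden reading, not the paper's objective: it assumes a bag of M positive and N negative scores with posterior weights `q_pos` and `q_neg`, takes the BPR loss -log sigmoid(margin), and compares the expected loss over all M*N pairs with the loss evaluated once at the posterior-mean scores. Because -log sigmoid is convex, Jensen's inequality guarantees that the second quantity never exceeds the first. All function names and numbers are hypothetical.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bpr_loss(margin):
    # -log(sigmoid(margin)) equals softplus(-margin); convex in margin.
    return softplus(-margin)


def pairwise_expected_loss(pos_scores, q_pos, neg_scores, q_neg):
    # Expected loss over every (positive, negative) pair: O(M * N) terms.
    return sum(
        qp * qn * bpr_loss(sp - sn)
        for sp, qp in zip(pos_scores, q_pos)
        for sn, qn in zip(neg_scores, q_neg)
    )


def compressed_loss(pos_scores, q_pos, neg_scores, q_neg):
    # Loss at the posterior-mean scores: O(M + N) work.
    pos_mean = sum(q * s for s, q in zip(pos_scores, q_pos))
    neg_mean = sum(q * s for s, q in zip(neg_scores, q_neg))
    return bpr_loss(pos_mean - neg_mean)


# Toy bag with M = 3 positives and N = 2 negatives (illustrative numbers).
pos_scores, q_pos = [1.5, 0.2, -0.3], [0.6, 0.3, 0.1]
neg_scores, q_neg = [0.4, -1.0], [0.7, 0.3]

full = pairwise_expected_loss(pos_scores, q_pos, neg_scores, q_neg)
compact = compressed_loss(pos_scores, q_pos, neg_scores, q_neg)
jensen_gap = full - compact  # non-negative by Jensen's inequality
```

In this toy setting the gap shrinks to zero when both posteriors are one-hot, since there is then a single pair and nothing to average over. How the paper bounds the gap in general is stated in Proposition 1 itself and is not reproduced here.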