DualVAE: Dual Disentangled Variational AutoEncoder for Recommendation

Zhiqiang Guo, Guohui Li, Jianjun Li, Chaoyang Wang, Si Shi

TL;DR

DualVAE tackles implicit feedback in collaborative filtering by learning disentangled, multi‑aspect representations for both users and items. It introduces four integrated components—Attention‑aware Dual Disentanglement (ADD), Disentangled Variational Inference (DVI), Joint Generation (JG), and Neighborhood‑enhanced Representation Constraint (NRC)—to model cross‑entity matching across $A$ aspects with latent dimension $d$, while enforcing correspondence and independence via contrastive learning. The method uses prototype‑driven aspect assignment and a Poisson likelihood with aspect‑weighted interactions, complemented by InfoNCE‑based neighborhood constraints to improve robustness in sparse data. Experiments on MovieLens‑1M, Kindle, and Yelp show consistent, statistically significant improvements over strong baselines and provide interpretable insights into the learned disentangled factors.
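To make the aspect-weighted Poisson likelihood mentioned above concrete, here is a minimal sketch in plain Python. The summary does not give the exact formula, so everything below is an assumption made for illustration: the softmax over aspect logits, the exponential link on the per-aspect dot products, and the function names `aspect_weighted_rate` and `poisson_log_likelihood` are hypothetical and may differ from the paper's formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def aspect_weighted_rate(user_aspects, item_aspects, aspect_logits):
    """Poisson rate for one user-item pair.

    user_aspects, item_aspects: A vectors of dimension d each.
    aspect_logits: A unnormalized aspect scores.
    Assumption: rate = sum_a w_a * exp(<z_u^a, z_i^a>), with w = softmax(logits).
    """
    weights = softmax(aspect_logits)
    return sum(
        w * math.exp(dot(zu, zi))
        for w, zu, zi in zip(weights, user_aspects, item_aspects)
    )


def poisson_log_likelihood(x, rate):
    """log Poisson(x | rate) = x * log(rate) - rate - log(x!)."""
    return x * math.log(rate) - rate - math.lgamma(x + 1)


# Toy example with A = 2 aspects and d = 2 latent dimensions.
user = [[0.5, -0.1], [0.2, 0.3]]
item = [[0.4, 0.0], [-0.3, 0.6]]
rate = aspect_weighted_rate(user, item, aspect_logits=[1.0, 0.0])
print(poisson_log_likelihood(1, rate))  # log-likelihood of an observed interaction
print(poisson_log_likelihood(0, rate))  # log-likelihood of a missing interaction
```

Under these assumptions, each aspect contributes its own matching score, and the aspect weights decide how much each one counts toward the predicted interaction rate.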

Abstract

Learning precise representations of users and items to fit observed interaction data is the fundamental task of collaborative filtering. Existing studies usually infer entangled representations to fit such interaction data, neglecting to model the diverse matching relationships between users and items behind their interactions, which leads to limited performance and weak interpretability. To address this problem, we propose a Dual Disentangled Variational AutoEncoder (DualVAE) for collaborative recommendation, which combines disentangled representation learning with variational inference to facilitate the generation of implicit interaction data. Specifically, we first implement the disentangling concept by unifying attention-aware dual disentanglement with a disentangled variational autoencoder to infer the disentangled latent representations of users and items. Further, to encourage the correspondence and independence of the disentangled representations of users and items, we design a neighborhood-enhanced representation constraint with a customized contrastive mechanism to improve the representation quality. Extensive experiments on three real-world benchmarks show that our proposed model significantly outperforms several recent state-of-the-art baselines. Further empirical results also illustrate the interpretability of the disentangled representations learned by DualVAE.
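The contrastive constraint described in the abstract builds on an InfoNCE-style objective. The sketch below shows only the generic InfoNCE loss for one anchor, not DualVAE's customized mechanism: how the paper selects neighbors, positives, and negatives is not stated here, and the cosine similarity, the `temperature` value, and the function names are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.2):
    """Generic InfoNCE loss: -log( exp(s_pos) / (exp(s_pos) + sum_k exp(s_neg_k)) ).

    Scores are cosine similarities divided by the temperature.
    """
    pos = cosine(anchor, positive) / temperature
    logits = [pos] + [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - pos


# Hypothetical reading: the anchor is one aspect of a user representation, the
# positive is the same aspect of a neighbor, and negatives are other aspects.
anchor = [1.0, 0.0]
print(info_nce(anchor, positive=[0.9, 0.1], negatives=[[0.0, 1.0], [-1.0, 0.2]]))
```

Minimizing this loss pulls the anchor toward its positive and pushes it away from the negatives, which is one way to encourage both correspondence and independence between aspect-level representations.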

Paper Structure (28 sections, 15 equations, 4 figures, 3 tables)

This paper contains 28 sections, 15 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Illustration of (a) user-side VAE; (b) disentangled user-side VAE; and (c) explicit/implicit matching of multi-aspect features between users and items in a movie recommendation scenario. Words with different colors indicate different aspects.
  • Figure 2: Performance under varying values of the parameters $A$ and $\gamma$, in terms of R@$20$ and N@$20$, on AKindle and Yelp.
  • Figure 3: Visualization of the aspect probability maps of users $u_{1259}$ and $u_{5443}$ and of $20$ randomly sampled items they interacted with, on the ML1M dataset.
  • Figure 4: Case studies of our proposed DualVAE on the AKindle dataset. Red words reflect users' attitudes in their reviews towards certain aspects of an item.