DualVAE: Dual Disentangled Variational AutoEncoder for Recommendation
Zhiqiang Guo, Guohui Li, Jianjun Li, Chaoyang Wang, Si Shi
TL;DR
DualVAE addresses collaborative filtering on implicit feedback by learning disentangled, multi-aspect representations for both users and items. It integrates four components: Attention-aware Dual Disentanglement (ADD), Disentangled Variational Inference (DVI), Joint Generation (JG), and Neighborhood-enhanced Representation Constraint (NRC). Together they model user-item matching across $A$ aspects with latent dimension $d$, while enforcing correspondence and independence of the aspect representations via contrastive learning. The method uses prototype-driven aspect assignment and a Poisson likelihood with aspect-weighted interactions, complemented by InfoNCE-based neighborhood constraints to improve robustness on sparse data. Experiments on MovieLens-1M, Kindle, and Yelp show consistent, statistically significant improvements over strong baselines and provide interpretable insights into the learned disentangled factors.
Abstract
Learning precise representations of users and items to fit observed interaction data is the fundamental task of collaborative filtering. Existing studies usually infer entangled representations to fit such data and neglect the diverse matching relationships between users and items that underlie their interactions, which limits both performance and interpretability. To address this problem, we propose a Dual Disentangled Variational AutoEncoder (DualVAE) for collaborative recommendation, which combines disentangled representation learning with variational inference to facilitate the generation of implicit interaction data. Specifically, we first implement the disentangling concept by unifying attention-aware dual disentanglement with a disentangled variational autoencoder to infer the disentangled latent representations of users and items. Further, to encourage the correspondence and independence of the disentangled representations of users and items, we design a neighborhood-enhanced representation constraint with a customized contrastive mechanism to improve representation quality. Extensive experiments on three real-world benchmarks show that our proposed model significantly outperforms several recent state-of-the-art baselines. Further empirical results also illustrate the interpretability of the disentangled representations learned by DualVAE.
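The summary above names a Poisson likelihood with aspect-weighted interactions and an InfoNCE-based contrastive constraint, but gives no formulas at this point. The following is a minimal Python sketch of how such pieces are commonly written, not the paper's exact formulation. The softplus scoring, the softmax aspect weights, the temperature value, and every function name here are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def aspect_weighted_rate(user_aspects, item_aspects, aspect_logits):
    """Poisson rate as a weighted sum of per-aspect matching scores.

    user_aspects, item_aspects: A vectors of length d (one per aspect).
    aspect_logits: A raw attention scores (hypothetical input).
    """
    weights = softmax(aspect_logits)
    # Assumption: softplus keeps each score positive, so the rate is valid.
    scores = [math.log1p(math.exp(dot(u, v)))
              for u, v in zip(user_aspects, item_aspects)]
    return sum(w * s for w, s in zip(weights, scores))


def poisson_log_likelihood(count, rate):
    """Log-probability of an observed interaction count under Poisson(rate)."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def info_nce(anchor, positive, negatives, temperature=0.2):
    """InfoNCE loss: pull the anchor toward its positive, away from negatives."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    return -math.log(softmax(logits)[0])
```

In this sketch, a training objective would add the negative Poisson log-likelihood over observed interactions to a weighted InfoNCE term, where the positive for a user's aspect representation could be the matching aspect of a neighbor. How DualVAE actually selects neighbors and weights the terms is not stated in this section.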
