Semantic Gaussian Mixture Variational Autoencoder for Sequential Recommendation

Beibei Li, Tao Xiang, Beihong Jin, Yiyuan Zheng, Rui Zhao

TL;DR

This paper tackles the limitation of unimodal priors in variational autoencoder-based sequential recommendation (SR) by introducing SIGMA, a two-component framework that jointly learns a semantic Gaussian mixture prior and sequence representations. The Multi-Interest Extraction VAE (MIE-VAE) disentangles multiple user interests into unimodal Gaussians, while the Semantic Gaussian Mixture VAE (SGM-VAE) aligns the sequence posterior with a mixture prior derived from those interests, using a tailored KL term and reconstruction objective. In experiments on four public datasets, SIGMA consistently outperforms baselines and shows improved diversity when the mixture prior weight is balanced. Ablations highlight the importance of the mixture prior, the orthogonal category constraint, and the interplay between MIE-VAE and SGM-VAE. The approach offers a principled way to model complex, multi-faceted user preferences in SR, with practical implications for more robust and diverse recommendations.

Abstract

Variational AutoEncoder (VAE) for Sequential Recommendation (SR), which learns a continuous distribution for each user-item interaction sequence rather than a deterministic embedding, is robust against data deficiency and achieves strong performance. However, existing VAE-based SR models assume a unimodal Gaussian distribution as the prior distribution of sequence representations, which restricts their capability to capture complex user interests and limits recommendation performance when users have more than one interest. Because it is common for users to have multiple disparate interests, we argue that it is more reasonable to establish a multimodal prior distribution in SR scenarios instead of a unimodal one. Therefore, in this paper, we propose a novel VAE-based SR model named SIGMA. SIGMA assumes that the prior of the sequence representation conforms to a Gaussian mixture distribution, where each component of the distribution semantically corresponds to one of multiple interests. For multi-interest elicitation, SIGMA includes a probabilistic multi-interest extraction module that learns a unimodal Gaussian distribution for each interest according to implicit item hyper-categories. Additionally, to incorporate the multimodal interests into sequence representation learning, SIGMA constructs a multi-interest-aware ELBO, which is compatible with the Gaussian mixture prior. Extensive experiments on public datasets demonstrate the effectiveness of SIGMA. The code is available at https://github.com/libeibei95/SIGMA.
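
To make the idea of regularizing a sequence posterior toward a Gaussian mixture prior more concrete, the following is a minimal sketch of a generic Monte Carlo estimate of KL(q || p), where q is a diagonal Gaussian and p is a mixture of diagonal Gaussians. This is not the multi-interest-aware ELBO from the paper, whose exact form is not given in this summary. The function names (log_normal_diag, log_mixture, mc_kl_to_mixture) and all parameter values are illustrative assumptions, and the sketch uses only the Python standard library.

```python
import math
import random


def log_normal_diag(z, mean, std):
    """Log-density of a diagonal Gaussian evaluated at point z."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((x - m) / s) ** 2
        for x, m, s in zip(z, mean, std)
    )


def log_mixture(z, weights, means, stds):
    """Log-density of a Gaussian mixture at z, using log-sum-exp for stability."""
    logs = [
        math.log(w) + log_normal_diag(z, m, s)
        for w, m, s in zip(weights, means, stds)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


def mc_kl_to_mixture(q_mean, q_std, weights, means, stds, n_samples=2000, seed=0):
    """Monte Carlo estimate of KL(q || p): average of log q(z) - log p(z), z ~ q.

    Hypothetical helper for illustration; a closed form does not exist in
    general when p is a mixture, so sampling is one common workaround.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Reparameterized sample: z = mean + std * eps, with eps ~ N(0, 1).
        z = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(q_mean, q_std)]
        total += log_normal_diag(z, q_mean, q_std) - log_mixture(z, weights, means, stds)
    return total / n_samples


if __name__ == "__main__":
    # Assumed toy setup: a 2-D posterior and two "interest" components.
    q_mean, q_std = [0.0, 0.0], [1.0, 1.0]
    weights = [0.5, 0.5]
    means = [[0.0, 0.0], [4.0, 4.0]]
    stds = [[1.0, 1.0], [1.0, 1.0]]
    print(mc_kl_to_mixture(q_mean, q_std, weights, means, stds))
```

In this toy setup one mixture component coincides with the posterior and carries weight 0.5, so the estimate cannot exceed log 2. A unimodal prior is the special case of a single component with weight 1.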

Paper Structure

This paper contains 31 sections, 18 equations, 2 figures, 6 tables.

Figures (2)

  • Figure 1: Architecture of SIGMA. SIGMA comprises MIE-VAE and SGM-VAE. MIE-VAE aims to disentangle multiple interests and learn a unimodal Gaussian distribution for each of them, while SGM-VAE learns an enhanced sequence representation by aligning it with the semantic Gaussian mixture distribution composed of multiple interests.
  • Figure 2: Impact of category quantity.