
LGMRec: Local and Global Graph Learning for Multimodal Recommendation

Zhiqiang Guo, Jianjun Li, Guohui Li, Chaoyang Wang, Si Shi, Bin Ruan

TL;DR

LGMRec tackles multimodal recommendation by decoupling local user interests into collaborative and modality-specific graphs while leveraging a global hypergraph to capture modality-aware dependencies. The local graphs separately propagate ID-based collaborative signals and transformed modal features, and a hypergraph module learns cross-modal global relations with hyperedge attention and contrastive learning. Fusion of local and global embeddings, guided by an adaptive weight and a BPR-based objective augmented with hypergraph contrastive loss, yields robust performance, especially under data sparsity. Experiments on three Amazon-based datasets show consistent improvements over state-of-the-art baselines, validating the effectiveness of modeling both local structure and global modality-aware dependencies for accurate and robust recommendations.
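The fusion and ranking objective described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names (`fuse`, `bpr_loss`), the additive combination of the two local embeddings, and the scalar weight `alpha` applied to an L2-normalized global embedding are all assumptions made for illustration. The hypergraph contrastive term is omitted.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Row-wise L2 normalization; all-zero rows stay zero."""
    norm = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norm, eps)

def fuse(local_cf, local_modal, global_emb, alpha):
    """Hypothetical fusion: sum the two decoupled local embeddings,
    then add the normalized global embedding scaled by alpha."""
    return local_cf + local_modal + alpha * l2_normalize(global_emb)

def bpr_loss(user_emb, pos_item_emb, neg_item_emb):
    """Standard BPR loss: mean of -log(sigmoid(score_pos - score_neg))."""
    pos = np.sum(user_emb * pos_item_emb, axis=1)
    neg = np.sum(user_emb * neg_item_emb, axis=1)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed stably with logaddexp.
    return float(np.mean(np.logaddexp(0.0, -(pos - neg))))
```

Setting `alpha` to zero in this sketch reduces the model to its local part only, which makes it easy to compare a local-only configuration against a local-plus-global one.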

Abstract

Multimodal recommendation has gradually become infrastructure for online media platforms, enabling them to provide personalized service to users by jointly modeling users' historical behaviors (e.g., purchases, clicks) and the various modalities of items (e.g., visual and textual). Most existing studies focus on using modal features or modality-related graph structures to learn local user interests. Nevertheless, these approaches face two limitations: (1) shared updates of user ID embeddings couple collaborative and multimodal signals; (2) robust global user interests, which could alleviate the sparse-interaction problem faced by local interest modeling, remain unexplored. To address these issues, we propose a novel Local and Global Graph Learning-guided Multimodal Recommender (LGMRec), which jointly models local and global user interests. Specifically, we present a local graph embedding module that independently learns collaborative-related and modality-related embeddings of users and items from local topological relations. Moreover, a global hypergraph embedding module is designed to capture global user and item embeddings by modeling global dependency relations. The global embeddings acquired within the hypergraph embedding space can then be combined with the two decoupled local embeddings to improve the accuracy and robustness of recommendations. Extensive experiments conducted on three benchmark datasets demonstrate the superiority of LGMRec over various state-of-the-art recommendation baselines, showcasing its effectiveness in modeling both local and global user interests.
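As a rough illustration of how a hypergraph can pass information between nodes that are not directly connected, the sketch below runs one round of node-to-hyperedge-to-node propagation. It is an assumption-laden sketch, not the paper's module: the name `hypergraph_propagate`, the use of a plain softmax to build a soft incidence matrix, and the dot-product affinity between nodes and hyperedge embeddings are all hypothetical choices.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def hypergraph_propagate(node_emb, hyperedge_emb):
    """One hypothetical propagation round over a soft hypergraph.

    node_emb:      (n_nodes, d) embeddings of users or items
    hyperedge_emb: (n_hyperedges, d) hyperedge embeddings (assumed learnable)
    """
    # Soft incidence matrix: how strongly each node belongs to each hyperedge.
    incidence = softmax(node_emb @ hyperedge_emb.T, axis=1)  # (n_nodes, n_hyperedges)
    # Aggregate node information into each hyperedge.
    edge_msg = incidence.T @ node_emb                        # (n_hyperedges, d)
    # Send hyperedge information back to every node.
    return incidence @ edge_msg                              # (n_nodes, d)
```

Because every node can attach to every hyperedge with some weight, two users with no shared items can still exchange information in this sketch, which is the intuition behind using global dependencies to offset sparse interactions.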

Paper Structure

This paper contains 35 sections, 15 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Illustrations of (a) sharing of user ID embeddings, (b) the gradient comparison of user ID embeddings updated from different models during training, (c) local user-item interaction graph, and (d) global dependencies between users and attributions. Darker lines indicate greater user interest.
  • Figure 2: The framework of the proposed LGMRec with visual and textual modalities of items (i.e., $m\in \{v,t\}$).
  • Figure 3: Performance w.r.t. different degrees of user interaction sparsity in terms of R@$20$ on the Baby and Sports datasets.
  • Figure 4: Performance under different settings of two key hyperparameters ($A$ and $\alpha$) on the Clothing dataset.
  • Figure 5: Case study of learned global dependencies of two users $u_{1344}$ and $u_{4351}$ with four hyperedges on the Baby dataset.
  • ...and 1 more figures