MMHCL: Multi-Modal Hypergraph Contrastive Learning for Recommendation

Xu Guo, Tong Zhang, Fuyun Wang, Xudong Wang, Xiaoya Zhang, Xin Liu, Zhen Cui

TL;DR

MMHCL tackles data sparsity and cold-start in multimodal recommendation by introducing a dual-hypergraph framework that learns from user-to-user and item-to-item relationships. It combines a hypergraph-based representation with a standard collaborative-filtering backbone and a synergistic contrastive learning objective to align second-order and first-order signals. The approach yields dense, high-order semantic information and robust feature representations, achieving state-of-the-art results on three multimodal datasets with notable gains in Recall@20. This work highlights the value of higher-order hypergraph modeling and cross-view contrastive signals for practical, scalable multimodal recommendation.

Abstract

The growth of multimodal content-sharing platforms is driving the development of personalized recommender systems. Previous works usually suffer from data sparsity and cold-start problems, and may fail to adequately explore semantic user-product associations in multimodal data. To address these issues, we propose a novel Multi-Modal Hypergraph Contrastive Learning (MMHCL) framework for user recommendation. To explore user-product relations comprehensively, we construct two hypergraphs, a user-to-user (u2u) hypergraph and an item-to-item (i2i) hypergraph, to mine shared preferences among users and intricate multimodal semantic resemblance among items, respectively. This process yields denser second-order semantics, which are fused with first-order user-item interactions as a complementary signal to alleviate data sparsity. We then design a contrastive feature enhancement paradigm based on synergistic contrastive learning. By maximizing the mutual information between second-order (e.g., shared preference patterns for users) and first-order (information about the items a user selected) embeddings of the same user or item, and minimizing it for different users or items, feature distinguishability is effectively enhanced. Compared with using only the sparse primary user-item interactions, MMHCL obtains denser second-order hypergraphs and uncovers more shared attributes for exploring user-product associations, which alleviates the data sparsity and cold-start problems to a certain extent. Extensive experiments demonstrate the effectiveness of our method. Our code is publicly available at: https://github.com/Xu107/MMHCL.
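
The two ideas in the abstract, second-order propagation over a u2u hypergraph and a contrastive objective between first-order and second-order embeddings, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes that each item acts as a hyperedge joining the users who interacted with it, uses the commonly cited symmetric hypergraph normalization, and substitutes a one-directional InfoNCE loss for the paper's synergistic contrastive objective. The names `u2u_propagate`, `info_nce`, and `tau` are hypothetical.

```python
import numpy as np


def _safe_power(d, p):
    """Raise positive entries of d to the power p; zeros stay zero."""
    out = np.zeros_like(d, dtype=float)
    mask = d > 0
    out[mask] = d[mask] ** p
    return out


def u2u_propagate(R, X):
    """One parameter-free hypergraph propagation step over users.

    Assumption: the user-item matrix R (users x items) is used directly as
    the incidence matrix H, so every item is a hyperedge over its users.
    Computes Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X.
    """
    H = np.asarray(R, dtype=float)
    dv = _safe_power(H.sum(axis=1), -0.5)  # user (vertex) degrees
    de = _safe_power(H.sum(axis=0), -1.0)  # item (hyperedge) degrees
    A = (dv[:, None] * H * de[None, :]) @ (H.T * dv[None, :])
    return A @ X


def info_nce(z1, z2, tau=0.2, eps=1e-12):
    """InfoNCE loss: row i of z1 and row i of z2 form the positive pair,
    all other rows of z2 serve as negatives. tau is an assumed temperature."""
    z1 = z1 / (np.linalg.norm(z1, axis=1, keepdims=True) + eps)
    z2 = z2 / (np.linalg.norm(z2, axis=1, keepdims=True) + eps)
    logits = z1 @ z2.T / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))


# Toy data: 4 users, 3 items; the last user has no interactions.
rng = np.random.default_rng(0)
R = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [0, 0, 0]])
first_order = rng.normal(size=(4, 8))            # stand-in for CF embeddings
second_order = u2u_propagate(R, first_order)     # shared-preference view
loss = info_nce(first_order, second_order)
```

In this sketch a user without interactions receives an all-zero second-order embedding, which is one reason a real system would also rely on the item-side (i2i) hypergraph built from multimodal features.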

Paper Structure

This paper contains 32 sections, 10 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: Example of recommendation. We introduce a Hypergraph Neural Network (HGNN) to explicitly model shared preferences among users. The left side illustrates a complex user-item interaction graph where different colors indicate distinct hyperedges, each connecting multiple users and items. On the right, HGNN aggregates these second-order relationships to explicitly uncover deeper semantic connections, extracting denser, higher-order information.
  • Figure 2: The structure overview of the proposed MMHCL. A) The u2u hypergraph is constructed based on the user-item interactions for learning the user-level representations. B) The user/item embeddings acquired from the user/item level are fused with the embeddings learned from the downstream tasks. Meanwhile, synergistic contrastive learning is implemented. C) The i2i hypergraph is built using the raw multimodal features of items to learn the item-level representations. See the Overview section for details.
  • Figure 3: Performance on the cold-start problem.
  • Figure 4: Performance (Recall@20) w.r.t. the synergistic contrastive learning coefficients $\alpha$ for the u2u hypergraph and $\beta$ for the i2i hypergraph on the Clothing and Sports datasets.
  • Figure 5: Performance comparison w.r.t various hyperparameters on the three datasets.
  • ...and 3 more figures