Modality Reliability Guided Multimodal Recommendation

Xue Dong, Xuemeng Song, Na Zheng, Sicheng Zhao, Guiguang Ding

TL;DR

This work tackles the degradation of multimodal recommendations caused by unreliable modality data. It introduces MARGO, a modality reliability guided framework that learns modality weights under supervision derived from modality-level differences in positive versus negative item predictions within the BPR objective, complemented by a dynamic weight calibration loss and a two-stage training procedure. The approach yields statistically consistent improvements over state-of-the-art baselines on three Amazon datasets, with insights into modality contributions and weight behavior. The findings demonstrate that reliability-guided fusion can effectively mitigate modality noise and enhance personalization in multimodal recommender systems.

Abstract

Multimodal recommendation suffers from a performance degradation problem: a uni-modal recommender sometimes outperforms its multimodal counterpart. A possible reason is that unreliable item modality data hurts the fusion result. Several existing studies introduce weights for different modalities to reduce the contribution of unreliable modality data to the final predicted user rating. However, they provide no appropriate supervision for learning the modality weights, so the learned weights are imprecise. We therefore propose a modality reliability guided multimodal recommendation framework that learns the modality weights under the supervision of modality reliability. Since no explicit label is available for modality reliability, we identify it automatically through the BPR recommendation objective. In particular, we define a modality reliability vector as the supervision label, computed from the difference between the modality-specific user ratings of positive and negative items; a larger difference indicates higher reliability of the modality, because the BPR objective is better satisfied. Furthermore, to make the supervision more effective, we calculate a confidence level for the modality reliability vector, which dynamically adjusts the supervision strength and eliminates harmful supervision. Extensive experiments on three real-world datasets demonstrate the effectiveness of the proposed method.
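
The idea in the abstract can be illustrated with a small NumPy sketch. It is not the paper's implementation: the abstract only states that the reliability vector comes from the difference between modality-specific ratings of positive and negative items, and that a confidence level scales the supervision. The softmax normalization, the particular confidence formula, the squared-error calibration loss, and all function names below are assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def bpr_loss(pos_score, neg_score):
    # Standard BPR term: -log(sigmoid(pos - neg)) = log(1 + exp(-(pos - neg))).
    return np.logaddexp(0.0, -(pos_score - neg_score))


def reliability_vector(pos_scores, neg_scores):
    # pos_scores, neg_scores: (batch, n_modalities) modality-specific ratings.
    # A larger positive-minus-negative margin means the modality satisfies
    # the BPR ordering better. ASSUMPTION: softmax turns margins into a
    # normalized supervision label.
    margin = pos_scores - neg_scores
    return softmax(margin, axis=-1)


def confidence(pos_scores, neg_scores):
    # ASSUMPTION (hypothetical stand-in for the paper's confidence level):
    # if no modality ranks the positive item above the negative one, the
    # label is treated as harmful and gets zero weight; otherwise the
    # weight grows with the best margin.
    best = (pos_scores - neg_scores).max(axis=-1)
    return np.where(best > 0.0, 1.0 / (1.0 + np.exp(-best)), 0.0)


def calibration_loss(weights, pos_scores, neg_scores):
    # ASSUMPTION: confidence-weighted squared error between the learned
    # modality weights and the reliability label. In an autodiff framework
    # the label would typically be detached from the gradient.
    target = reliability_vector(pos_scores, neg_scores)
    conf = confidence(pos_scores, neg_scores)
    per_sample = ((weights - target) ** 2).sum(axis=-1)
    return (conf * per_sample).mean()


# Two (user, positive item, negative item) triplets, two modalities each.
pos = np.array([[2.0, 0.5], [0.0, 0.0]])
neg = np.array([[0.0, 1.0], [1.0, 1.0]])
weights = np.array([[0.5, 0.5], [0.5, 0.5]])  # uninformed modality weights
print(reliability_vector(pos, neg))
print(confidence(pos, neg))
print(calibration_loss(weights, pos, neg))
```

In the first triplet only the first modality ranks the positive item higher, so it receives the larger reliability value. In the second triplet neither modality does, so under the assumed confidence rule the supervision for that sample is switched off.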

Paper Structure

This paper contains 21 sections, 10 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: Examples of items with unreliable modality data.
  • Figure 2: Illustration of the proposed MARGO framework, which adds explicit supervision to modality weight learning. Specifically, we derive a new modality reliability vector from the training objective to simulate the reliability of the modality data. We also compute a confidence for the modality reliability vector to ensure that the supervision is effective.
  • Figure 3: Illustration of the weight calibration process.
  • Figure 4: Performance of the proposed MARGO with respect to different values of the trade-off parameter $\alpha$ on three datasets. The x-axis shows the value of $\alpha$. The left and right y-axes show Recall@20 and NDCG@20, respectively.
  • Figure 5: Histogram of the weights of images learned by the best baseline DRAGON, variant w/o stopgrad, and our proposed method MARGO. The x-axis and y-axis refer to the weight value and item number, respectively.
  • ...and 1 more figure