Hard vs. Noise: Resolving Hard-Noisy Sample Confusion in Recommender Systems via Large Language Models

Tianrui Song, Wen-Shuo Chao, Hao Liu

TL;DR

This work investigates hard-noisy sample confusion in implicit-feedback recommender systems and introduces LLMHNI, which leverages two auxiliary signals from large language models: semantic relevance from LLM-encoded embeddings and logical relevance from LLM-inferred interactions. The framework comprises semantic-relevance-guided hard negative mining with objective-aligned embeddings and logical-relevance-guided interaction denoising with cross-graph contrastive alignment and hallucination-robust learning, jointly optimized as $\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{rec}} + \lambda_{1} \mathcal{L}_{\text{de}} + \lambda_{2} \mathcal{L}_{\text{hal}}$. Extensive experiments on three real-world datasets and two backbone recommenders show significant improvements in denoising and recommendation performance, along with robust performance under increasing noise levels. By addressing objective mismatch and hallucination-induced errors through embedding alignment and cross-graph contrastive strategies, the approach targets more reliable and effective recommender systems.
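As a rough illustration of how the three terms in the objective above combine, the following Python sketch pairs a generic InfoNCE contrastive term with the weighted sum. The function names, the temperature, the default weights, and the choice of InfoNCE are all assumptions made for illustration; this summary does not specify the exact form of $\mathcal{L}_{\text{de}}$ or $\mathcal{L}_{\text{hal}}$.

```python
import math


def info_nce(view_a, view_b, temperature=0.2):
    """Generic InfoNCE over paired rows (an assumed stand-in for a contrastive term).

    Row i of view_a is the positive for row i of view_b; every other row of
    view_b acts as a negative.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    loss = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity to every row of the other view, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(a)


def total_loss(l_rec, l_de, l_hal, lambda_1=0.1, lambda_2=0.1):
    """Weighted sum L_rec + lambda_1 * L_de + lambda_2 * L_hal (weights are hypothetical)."""
    return l_rec + lambda_1 * l_de + lambda_2 * l_hal
```

In this sketch, two views whose rows agree give a small contrastive term, while views whose rows are mismatched give a large one; the two weights control how strongly the auxiliary terms influence the recommendation loss.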

Abstract

Implicit feedback, employed in training recommender systems, unavoidably contains noise due to factors such as misclicks and position bias. Previous studies have attempted to identify noisy samples through their divergent data patterns, such as higher loss values, and to mitigate their influence through sample dropping or reweighting. However, we observe that noisy samples and hard samples display similar patterns, leading to a hard-noisy confusion issue. Such confusion is problematic because hard samples are vital for modeling user preferences. To solve this problem, we propose the LLMHNI framework, which leverages two auxiliary user-item relevance signals generated by Large Language Models (LLMs) to differentiate hard and noisy samples. LLMHNI obtains user-item semantic relevance from LLM-encoded embeddings, which is used in negative sampling to select hard negatives while filtering out noisy false negatives. An objective alignment strategy is proposed to project LLM-encoded embeddings, originally optimized for general language tasks, into a representation space optimized for user-item relevance modeling. LLMHNI also exploits LLM-inferred logical relevance within user-item interactions to identify hard and noisy samples. These LLM-inferred interactions are integrated into the interaction graph and guide denoising with cross-graph contrastive alignment. To eliminate the impact of unreliable interactions induced by LLM hallucination, we propose a graph contrastive learning strategy that aligns representations from randomly edge-dropped views to suppress unreliable edges. Empirical results demonstrate that LLMHNI significantly improves denoising and recommendation performance.
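The idea of using semantic relevance to pick hard negatives while skipping likely false negatives can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: cosine similarity stands in for the relevance score, and the `low` and `high` thresholds, the fallback rule, and all function names are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sample_hard_negative(user_emb, item_embs, interacted, low=0.3, high=0.8, rng=random):
    """Sample an uninteracted item whose relevance to the user lies in [low, high).

    Assumed heuristic: scores >= high mark likely false negatives (skipped),
    scores < low mark easy negatives, and the band in between holds hard negatives.
    Returns None when the user has interacted with every item.
    """
    scores = {
        item: cosine(user_emb, emb)
        for item, emb in item_embs.items()
        if item not in interacted
    }
    if not scores:
        return None
    band = [item for item, score in scores.items() if low <= score < high]
    if band:
        return rng.choice(band)
    # Fallback: the most relevant candidate below the false-negative threshold,
    # or the most relevant candidate overall if none falls below it.
    below = [item for item, score in scores.items() if score < high]
    pool = below or list(scores)
    return max(pool, key=scores.get)
```

For example, with a user embedding `[1, 0]`, an uninteracted item at `[1, 1]` (similarity about 0.71) falls inside the band and is selected, whereas an item at `[0.99, 0.1]` is treated as a probable false negative and an item at `[0, 1]` as an easy negative.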

Paper Structure

This paper contains 36 sections, 24 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: On the left, we demonstrate that hard and noisy samples display similar patterns in both loss values and prediction scores throughout the training process. On the right, we take the results from the 5th epoch as an example to illustrate how the prediction scores and loss values of hard and noisy samples overlap in distribution. Additional details about this figure can be found in the appendix.
  • Figure 2: Semantic Relevance and Logical Relevance.
  • Figure 3: The overview of our proposed LLMHNI framework.
  • Figure 4: Model performance w.r.t. different noise ratios. The bar chart represents Recall values (see left y-axis), while the line chart shows Drop Rate (see right y-axis). All denoising methods are trained with the LightGCN backbone.
  • Figure 5: Hyper-parameter analysis of LLMHNI with the LightGCN backbone on the Amazon-books dataset.
  • ...and 5 more figures