Large Language Model Enhanced Graph Invariant Contrastive Learning for Out-of-Distribution Recommendation

Jiahao Liang, Haoran Yang, Xiangyu Zhao, Zhiwen Yu, Mianjie Li, Chuan Shi, Kaixiang Yang

TL;DR

The paper tackles out-of-distribution (OOD) generalization in graph-based recommender systems by introducing InvGCLLM, a framework that combines data-driven invariant learning with knowledge-rich Large Language Models (LLMs). It decomposes user-item interactions into invariant and variant components, uses an LLM to calibrate graph edits that yield a purified pair of causal and spurious views, and learns representations via a Causal-Informed Contrastive Learning objective. Theoretical analysis and experiments on multiple datasets show that InvGCLLM outperforms state-of-the-art baselines under distribution shifts, with ablations confirming the contributions of both LLM-based editing and invariant learning. The approach offers a practical path to robust, explainable OOD recommendation by combining graph structure refinement with principled causal learning.

Abstract

Out-of-distribution (OOD) generalization has emerged as a significant challenge in graph recommender systems. Traditional graph neural network algorithms often fail because they learn spurious environmental correlations instead of stable causal relationships, leading to substantial performance degradation under distribution shifts. Recent advances in Large Language Models (LLMs) offer a promising avenue thanks to their vast world knowledge and reasoning capabilities, but effectively integrating this knowledge with the fine-grained topology of a specific graph to solve the OOD problem remains difficult. To address these issues, we propose Invariant Graph Contrastive Learning with LLMs for Out-of-Distribution Recommendation (InvGCLLM), a causal learning framework that integrates the strengths of data-driven models and knowledge-driven LLMs. Our framework first employs a data-driven invariant learning model to generate causal confidence scores for each user-item interaction. These scores then guide an LLM to perform targeted graph refinement, leveraging its world knowledge to prune spurious connections and add missing causal links. Finally, the structurally purified graphs provide robust supervision for a causality-guided contrastive learning objective, enabling the model to learn representations that are resilient to spurious correlations. Experiments conducted on four public datasets demonstrate that InvGCLLM achieves significant improvements in out-of-distribution recommendation, consistently outperforming state-of-the-art baselines.
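
The score-guided refinement step described above can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the invariant learning model has already produced a confidence score for every candidate user-item pair, including unobserved ones. The names `select_review_candidates`, `refine_graph`, and `llm_approves` are hypothetical, and the LLM call is replaced by a stub that approves every proposed edit.

```python
from typing import Callable, Dict, List, Set, Tuple

Edge = Tuple[int, int]  # (user_id, item_id)


def select_review_candidates(
    scores: Dict[Edge, float], observed: Set[Edge], k: int
) -> Tuple[List[Edge], List[Edge]]:
    """Pick the edges an LLM would be asked to review.

    prune candidates: the k observed edges with the LOWEST confidence.
    add candidates:   the k unobserved edges with the HIGHEST confidence.
    Ties are broken by edge id so the result is deterministic.
    """
    seen = sorted((e for e in scores if e in observed),
                  key=lambda e: (scores[e], e))
    unseen = sorted((e for e in scores if e not in observed),
                    key=lambda e: (-scores[e], e))
    return seen[:k], unseen[:k]


def refine_graph(
    observed: Set[Edge],
    prune: List[Edge],
    add: List[Edge],
    llm_approves: Callable[[str, Edge], bool],
) -> Set[Edge]:
    """Apply only the edits that the (hypothetical) LLM judge approves."""
    refined = set(observed)
    for edge in prune:
        if llm_approves("prune", edge):
            refined.discard(edge)
    for edge in add:
        if llm_approves("add", edge):
            refined.add(edge)
    return refined


# Toy data: made-up confidence scores for six user-item pairs.
scores = {
    (0, 10): 0.92, (0, 11): 0.08, (1, 10): 0.75,
    (1, 12): 0.15, (0, 12): 0.88, (1, 11): 0.05,
}
observed = {(0, 10), (0, 11), (1, 10), (1, 12)}

prune, add = select_review_candidates(scores, observed, k=1)
# Stub judge (assumption): approves every edit instead of querying an LLM.
refined = refine_graph(observed, prune, add, lambda action, edge: True)
```

In this toy run the lowest-confidence observed edge is proposed for pruning and the highest-confidence unobserved pair is proposed for addition. In a real pipeline the stub would be replaced by a prompt to the LLM containing the user's history and the candidate item's description.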

Paper Structure

This paper contains 40 sections, 31 equations, 7 figures, 1 table, 1 algorithm.

Figures (7)

  • Figure 1: Illustration of user preference shift over time, highlighting the OOD recommendation challenge as $u_1$'s favored movie genres change with age.
  • Figure 2: The InvGCLLM framework: Environment Extractor identifies invariant and variant features, Invariant Learning Module optimizes stable representations across environments, LLMGCL Module generates contrastive views, and Recommendation Module delivers robust out-of-distribution (OOD) recommendations.
  • Figure 3: The LLMC2GE workflow: The process starts with a prompt that leverages user history and MovieLens-1M data to predict interest, followed by top-$k$ and bottom-$k$ selection via an Edge Predictor, LoRA fine-tuning of the LLM, and graph structure refinement with added edges.
  • Figure 4: The specific prompt used in LLMC2GE, providing the LLM with user viewing history and candidate item descriptions from the MovieLens-1M dataset to predict user interest.
  • Figure 5: Ablation study results on four datasets (Douban, MovieLens-1M, Yahoo, Amazon-Book), comparing InvGCLLM, InvGCN, GCLLM, and InvGCL in terms of NDCG@10, Precision@10, and Recall@10. InvGCLLM consistently achieves the best performance.
  • ...and 2 more figures