DaRec: A Disentangled Alignment Framework for Large Language Model and Recommender System

Xihong Yang, Heming Jing, Zixing Zhang, Jindong Wang, Huakang Niu, Shuaiqiang Wang, Yu Lu, Junfeng Wang, Dawei Yin, Xinwang Liu, En Zhu, Defu Lian, Erxue Min

TL;DR

This work proves, using an information-theoretic argument, that directly aligning the representations of LLMs and collaborative models is suboptimal for downstream recommendation performance, and proposes a novel plug-and-play alignment framework for LLMs and collaborative models.

Abstract

Benefiting from their strong reasoning capabilities, large language models (LLMs) have demonstrated remarkable performance in recommender systems. Various efforts have been made to distill knowledge from LLMs to enhance collaborative models, employing techniques such as contrastive learning for representation alignment. In this work, we prove, using an information-theoretic argument, that directly aligning the representations of LLMs and collaborative models is suboptimal for downstream recommendation performance. Consequently, the challenge of effectively aligning semantic representations between collaborative models and LLMs remains unresolved. Motivated by this observation, we propose a novel plug-and-play alignment framework for LLMs and collaborative models. Specifically, we first disentangle the latent representations of both LLMs and collaborative models into specific and shared components via projection layers and representation regularization. Subsequently, we perform both global and local structure alignment on the shared representations to facilitate knowledge transfer. Additionally, we theoretically prove that the specific and shared representations contain more pertinent and less irrelevant information, which can enhance the effectiveness of downstream recommendation tasks. Extensive experiments on benchmark datasets show that our method outperforms existing state-of-the-art algorithms.
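
To make the disentangle-then-align idea concrete, here is a minimal NumPy sketch of the steps the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the linear projections, the squared-cosine orthogonality penalty, the Gaussian-potential uniformity term, and the similarity-matrix form of global alignment are all assumed forms, and every name and dimension (`project`, `orthogonal_loss`, `uniformity_loss`, `global_alignment_loss`, `d_c`, `d_l`) is hypothetical. Local structure alignment is omitted.

```python
import numpy as np

def project(E, W):
    """Linear projection followed by L2 normalisation of each row (assumed form)."""
    Z = E @ W
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)

def orthogonal_loss(shared, specific):
    """Mean squared cosine similarity between each row's shared and specific parts.
    Rows are unit-norm, so the row-wise dot product is the cosine."""
    cos = np.sum(shared * specific, axis=1)
    return float(np.mean(cos ** 2))

def uniformity_loss(Z, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs of rows.
    Lower values mean the rows are spread more evenly on the unit sphere."""
    sq_dist = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    upper = np.triu_indices(Z.shape[0], k=1)
    return float(np.log(np.mean(np.exp(-t * sq_dist[upper]))))

def global_alignment_loss(shared_c, shared_l):
    """Mean squared difference between the two pairwise-similarity matrices,
    one assumed way to compare global structure across the two spaces."""
    return float(np.mean((shared_c @ shared_c.T - shared_l @ shared_l.T) ** 2))

# Hypothetical sizes: 8 items, 16-d collaborative and 32-d LLM embeddings.
rng = np.random.default_rng(0)
n, d_c, d_l, d = 8, 16, 32, 4
E_c = rng.normal(size=(n, d_c))   # stand-in for collaborative representations
E_l = rng.normal(size=(n, d_l))   # stand-in for LLM representations

# Separate projections for the shared and specific components of each side.
W_c_sh, W_c_sp = rng.normal(size=(d_c, d)), rng.normal(size=(d_c, d))
W_l_sh, W_l_sp = rng.normal(size=(d_l, d)), rng.normal(size=(d_l, d))
c_sh, c_sp = project(E_c, W_c_sh), project(E_c, W_c_sp)
l_sh, l_sp = project(E_l, W_l_sh), project(E_l, W_l_sp)

reg = (orthogonal_loss(c_sh, c_sp) + orthogonal_loss(l_sh, l_sp)
       + uniformity_loss(c_sh) + uniformity_loss(l_sh))
align = global_alignment_loss(c_sh, l_sh)
print(f"regularization={reg:.4f}  alignment={align:.4f}")
```

In a trainable version the projection matrices would be learned and these terms added to the recommendation loss with weighting coefficients. Only the shared components enter the alignment term, which is the point of disentangling first.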

Paper Structure (26 sections, 4 theorems, 32 equations, 8 figures, 4 tables, 1 algorithm)

Key Result

Theorem 1

For the collaborative model encoder network $f_{\textbf{C}}(\cdot)$ and the LLM encoder network $f_{\textbf{L}}(\cdot)$, if the representations $\textbf{E}^{\textbf{C}}= f_{\textbf{C}}(\textbf{D})$ and $\textbf{E}^{\textbf{L}}= f_{\textbf{L}}(\textbf{D'})$ are exactly aligned in the latent space, i.e., ...

Figures (8)

  • Figure 1: Illustration of the information gap between LLMs and collaborative models. The noisy signals within the specific information of each aspect impede the alignment of shared information, leading to a decline in the quality of representation.
  • Figure 2: Illustration of our proposed disentangled alignment strategy. In our method, we first disentangle the representation into shared and specific components with two exclusive encoders and introduce orthogonal and uniformity loss to guarantee informative representations. Then, based on the shared representation, we devise a structure alignment strategy at both global and local levels to enhance the transfer of semantic knowledge from LLMs to collaborative models.
  • Figure 3: Ablation studies of our proposed method with four baselines on three datasets. The first, second, third, and fourth rows correspond to Recall@5, Recall@10, NDCG@5, and NDCG@10, respectively.
  • Figure 4: Sensitivity analysis with four baselines on three datasets for the hyper-parameter $K$.
  • Figure 5: Sensitivity analysis with four baselines on three datasets for the trade-off hyper-parameter $\lambda$.
  • ...and 3 more figures
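
The ablation figure reports Recall@K and NDCG@K. For readers unfamiliar with these metrics, the sketch below computes both for a single user under the common binary-relevance definitions. The paper's exact evaluation protocol is not given here, so this is an assumption about the standard form, and the function names `recall_at_k` and `ndcg_at_k` are hypothetical.

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of the user's relevant items that appear in the top-k list."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the top-k list divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Toy example: two of the three relevant items are retrieved in the top 5.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"b", "e", "z"}
print(recall_at_k(ranked, relevant, 5))
print(ndcg_at_k(ranked, relevant, 5))
```

Recall@K ignores where in the top K a hit occurs, whereas NDCG@K discounts hits by rank position, which is why the two are usually reported together.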

Theorems & Definitions (7)

  • Theorem 1
  • Theorem 2
  • proof
  • Lemma 1
  • Lemma 2
  • proof
  • proof