Mitigating Propensity Bias of Large Language Models for Recommender Systems

Guixian Zhang; Guan Yuan; Debo Cheng; Lin Liu; Jiuyong Li; Shichao Zhang

TL;DR

This work tackles the propensity bias and dimensional collapse that arise when incorporating LLM-generated side information into recommender systems. It introduces Counterfactual LLM Recommendation (CLLMR), which combines a spectrum-based Side Information Encoder (SSE) with counterfactual inference to debias the alignment between side information and collaborative signals. SSE uses an identifiable VAE with a spectrum-derived latent prior and controlled noise to capture structural information from historical interactions, preventing collapse across representation dimensions. Through causal modeling and counterfactual reasoning, CLLMR mitigates LLM biases during inference while maintaining the rich knowledge encoded by LLMs, achieving robust improvements across multiple backbone recommenders and real-world datasets. The approach couples contrastive alignment with causal debiasing, delivering practical gains for LLM-enhanced recommender systems with improved fairness and personalization.

Abstract

The rapid development of Large Language Models (LLMs) creates new opportunities for recommender systems, especially by exploiting the side information (e.g., descriptions and analyses of items) generated by these models. However, aligning this side information with collaborative information from historical interactions poses significant challenges. First, the inherent biases within LLMs can skew recommendations, resulting in distorted and potentially unfair user experiences. Second, this propensity bias often causes the aligned side information to represent all inputs in a low-dimensional subspace, a phenomenon known as dimensional collapse, which severely restricts the recommender system's ability to capture user preferences and behaviours. To address these issues, we introduce a novel framework named Counterfactual LLM Recommendation (CLLMR). Specifically, we propose a spectrum-based side information encoder that implicitly embeds structural information from historical interactions into the side information representation, thereby circumventing the risk of dimensional collapse. Furthermore, our CLLMR approach explores the causal relationships inherent in LLM-based recommender systems. By leveraging counterfactual inference, we counteract the biases introduced by LLMs. Extensive experiments demonstrate that our CLLMR approach consistently enhances the performance of various recommender models.
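
Dimensional collapse, as described above, can be checked by inspecting the singular values of a learned embedding matrix. The following is a minimal sketch of such a diagnostic, not the authors' evaluation code: the function names `singular_spectrum` and `effective_rank`, the entropy-based rank measure, and the synthetic "healthy" and "collapsed" matrices are all illustrative assumptions.

```python
import numpy as np


def singular_spectrum(embeddings: np.ndarray) -> np.ndarray:
    """Return the singular values (descending) of the mean-centred embeddings."""
    centred = embeddings - embeddings.mean(axis=0, keepdims=True)
    return np.linalg.svd(centred, compute_uv=False)


def effective_rank(singular_values: np.ndarray, eps: float = 1e-12) -> float:
    """Entropy-based effective rank: exp of the Shannon entropy of the
    singular values after normalising them to sum to one."""
    p = singular_values / (singular_values.sum() + eps)
    p = p[p > eps]  # drop numerically zero entries before taking logs
    return float(np.exp(-(p * np.log(p)).sum()))


# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n_items, dim = 1000, 32

# Embeddings that use all 32 dimensions.
healthy = rng.normal(size=(n_items, dim))

# Embeddings confined to a 2-dimensional subspace of the 32-dim space.
collapsed = rng.normal(size=(n_items, 2)) @ rng.normal(size=(2, dim))

healthy_rank = effective_rank(singular_spectrum(healthy))
collapsed_rank = effective_rank(singular_spectrum(collapsed))

print(f"effective rank (healthy):   {healthy_rank:.2f}")
print(f"effective rank (collapsed): {collapsed_rank:.2f}")
```

An effective rank far below the embedding dimension suggests that the representation occupies only a small subspace, which is the symptom the abstract calls dimensional collapse.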

Paper Structure (29 sections, 22 equations, 11 figures, 5 tables)

Introduction
Related Work
Bias in Artificial Intelligence
Recommender Systems with Side Information
Preliminary
Problem Setting
Graphical Causal Model
Causal Effect and Counterfactual Inference
The Proposed CLLMR Framework
Constructing Side Information
Spectrum-based Side Information Encoder
Counterfactual Debiased Recommendation
Training and Inference
Experimental evaluation
Experiment Settings
...and 14 more sections

Figures (11)

Figure 1: An example illustrating the introduction of propensity bias when using LLMs to generate profiles and reason about user 5596 in the Amazon dataset.
Figure 2: The singular values of the trained representations show that directly aligning the collaborative information and the side information of LLMs often leads to dimensional collapse, whereby the singular values of each dimension are very low and similar, losing the distinction between different elements. However, our proposed SSE method effectively mitigates this issue and guarantees the quality of the representation.
Figure 3: Example of a causal DAG where $X$, $Y$, and $M$ denote the exposure, outcome, and mediator variables, respectively. Gray nodes indicate variables set to reference values (e.g., $X = x^{*}$), and the half-shaded node represents the result affected by the reference value. In this figure, (a) represents the causal graph for the factual scenario, (b) shows the causal graph where $X$ is set to the reference state, and (c) illustrates the causal graph for the counterfactual scenario, where $X$ is set to different values.
Figure 4: An overview of our proposed CLLMR framework for training LLM-based recommender systems and performing counterfactual inference using causal graphs to eliminate LLM propensity bias during the recommendation inference stage.
Figure 5: Panel (a) shows the causal graph of the factual world, and panel (b) shows the causal graph of the counterfactual world. Gray nodes represent reference states, and half-shaded nodes are influenced by the reference state.
...and 6 more figures
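
The exposure, mediator, and outcome notation of Figure 3 corresponds to the standard quantities of causal mediation analysis, sketched below with textbook definitions. The subscripted symbol $Y_{x, M_{x'}}$ (the outcome when $X$ is set to $x$ and the mediator takes the value it would have under $X = x'$) is an assumed notation, and this summary does not state which of these effects CLLMR removes at inference time.

$$
\begin{aligned}
\text{TE}  &= Y_{x, M_{x}} - Y_{x^{*}, M_{x^{*}}} && \text{(total effect)} \\
\text{NDE} &= Y_{x, M_{x^{*}}} - Y_{x^{*}, M_{x^{*}}} && \text{(natural direct effect)} \\
\text{TIE} &= \text{TE} - \text{NDE} = Y_{x, M_{x}} - Y_{x, M_{x^{*}}} && \text{(total indirect effect)}
\end{aligned}
$$

Here $Y_{x, M_{x}}$ is the factual outcome, $Y_{x^{*}, M_{x^{*}}}$ is the outcome with $X$ at its reference value, and $Y_{x, M_{x^{*}}}$ is the counterfactual outcome in which $X$ takes its factual value while the mediator behaves as if $X = x^{*}$.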
