The Blessing of Reasoning: LLM-Based Contrastive Explanations in Black-Box Recommender Systems

Yuyan Wang, Pan Li, Minmin Chen

TL;DR

The paper challenges the conventional trade-off between explainability and predictive accuracy in recommender systems by introducing LR-Recsys, a framework that combines large language model (LLM) reasoning with deep neural networks (DNNs). It constructs a contrastive-explanation generator that outputs human-readable positive and negative explanations, embeds them via a fine-tuned AutoEncoder, and feeds these embeddings into a DNN alongside standard user, item, and context features. The authors provide statistical justification using high-dimensional multi-environment learning theory, arguing that LLMs possess robust knowledge of the key decision factors and that incorporating such knowledge improves learning efficiency and accuracy. Empirically, LR-Recsys achieves 3–14% improvements on three real-world datasets, with gains driven primarily by the LLMs’ reasoning capabilities rather than external domain knowledge or summarization, and ablations show that both positive and negative explanations are essential. The framework is compatible with any LLM, supports precomputation to mitigate latency, and yields actionable managerial insights by aggregating contrastive explanations for consumers, sellers, and platforms, highlighting broad business impact and practical deployment considerations.

Abstract

Modern recommender systems use ML models to predict consumer preferences from consumption history. Although these "black-box" models achieve impressive predictive performance, they often suffer from a lack of transparency and explainability. Contrary to the presumed tradeoff between explainability and accuracy, we show that integrating large language models (LLMs) with deep neural networks (DNNs) can improve both. We propose LR-Recsys, which augments DNN-based systems with LLM reasoning capabilities. LR-Recsys introduces a contrastive-explanation generator that produces human-readable positive and negative explanations. These explanations are embedded via a fine-tuned autoencoder and combined with consumer and product features to improve predictions. Beyond offering explainability, we show that LR-Recsys also improves learning efficiency and predictive accuracy, as supported by high-dimensional, multi-environment statistical learning theory. LR-Recsys outperforms state-of-the-art recommender systems by 3–14% on three real-world datasets. Importantly, our analysis reveals that these gains primarily derive from LLMs' reasoning capabilities rather than their external domain knowledge. LR-Recsys presents an effective approach to combining LLMs with traditional DNNs, two of the most widely used ML models today. The explanations generated by LR-Recsys provide actionable insights for consumers, sellers, and platforms, helping to build trust, optimize product offerings, and inform targeting strategies.
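The data flow described in the abstract (explanation embeddings combined with consumer and product features, then passed to a DNN) can be illustrated with a minimal sketch. This is not the authors' implementation: no LLM is called and no autoencoder is trained here. The embedding vectors are hand-written placeholders, the names `build_input` and `mlp_score` are hypothetical, and a single hidden layer with fixed weights stands in for the paper's DNN.

```python
import math


def build_input(consumer_feats, product_feats, pos_emb, neg_emb):
    """Concatenate standard features with the two explanation embeddings.

    pos_emb / neg_emb stand in for the autoencoder embeddings of the
    positive and negative explanations (placeholders, not real outputs).
    """
    return list(consumer_feats) + list(product_feats) + list(pos_emb) + list(neg_emb)


def mlp_score(features, w_hidden, b_hidden, w_out, b_out):
    """Toy scorer: one ReLU hidden layer followed by a sigmoid output."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


# Illustrative values only (assumptions, not data from the paper).
consumer = [0.2, 0.7]
product = [0.5]
pos_emb = [0.9, 0.1]   # placeholder embedding of the positive explanation
neg_emb = [0.3, 0.4]   # placeholder embedding of the negative explanation

x = build_input(consumer, product, pos_emb, neg_emb)

# Fixed weights for reproducibility; a real model would learn these.
w_hidden = [[0.1] * len(x), [-0.2] * len(x)]
b_hidden = [0.0, 0.0]
w_out = [1.0, 1.0]
b_out = 0.0

score = mlp_score(x, w_hidden, b_hidden, w_out, b_out)
print(f"input dimension: {len(x)}, predicted score: {score:.4f}")
```

Keeping the positive and negative embeddings as separate slices of the input vector mirrors the contrastive design: the downstream network can weight reasons for and against a purchase independently.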

Paper Structure

This paper contains 58 sections, 2 theorems, 33 equations, 10 figures, 6 tables, 1 algorithm.

Key Result

Lemma 1

Under the conditions detailed in the paper's appendix, the multi-environment estimator $\hat{\bm{\beta}}_L$ has variable selection consistency, i.e., it recovers the true support $S^*$ with probability tending to one, as long as $n, p, s^* \rightarrow \infty$ and $n \gg s^* \beta_{\text{min}}^{-2}\log{p}$, where $s^* = |S^*|$ and $\beta_{\text{min}} = \min_{j \in S^*} |\beta^*_j|$.
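To get a feel for the sample-size condition in Lemma 1, the quantity $s^* \beta_{\text{min}}^{-2}\log{p}$ can be computed for concrete values. The sketch below is only an arithmetic illustration: the helper name `sample_size_scale` and the example values are assumptions, and the lemma requires $n$ to dominate this quantity asymptotically rather than to exceed a fixed threshold.

```python
import math


def sample_size_scale(s_star, beta_min, p):
    """Return s* * beta_min^(-2) * log(p), the quantity n must dominate.

    s_star   : number of truly relevant variables, |S*|
    beta_min : smallest absolute coefficient on the true support
    p        : total number of candidate variables
    """
    if s_star <= 0 or beta_min <= 0 or p <= 1:
        raise ValueError("need s_star > 0, beta_min > 0 and p > 1")
    return s_star * math.log(p) / beta_min ** 2


# Hypothetical example values, not taken from the paper.
scale = sample_size_scale(s_star=10, beta_min=0.5, p=1000)
weaker_signal = sample_size_scale(s_star=10, beta_min=0.25, p=1000)

print(f"scale with beta_min=0.5:  {scale:.1f}")
print(f"scale with beta_min=0.25: {weaker_signal:.1f}")
```

Halving `beta_min` quadruples the scale, while the dependence on `p` is only logarithmic, which is why the condition can hold even when the number of candidate variables is very large.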

Figures (10)

  • Figure 1: Comparison between our proposed framework, LR-Recsys, and a typical recommender system.
  • Figure 2: A toy example for positive and negative explanations by GPT-4.
  • Figure 3: (Color online) Detailed architecture of LR-Recsys.
  • Figure 4: (Color online) AutoEncoder.
  • Figure 5: (Color online) Convergence rate comparison for the Lasso (unknown $S^*$) and Oracle estimator (known $S^*$).
  • ...and 5 more figures

Theorems & Definitions (2)

  • Lemma 1
  • Lemma 2