Augment or Not? A Comparative Study of Pure and Augmented Large Language Model Recommenders

Wei-Hsiang Huang, Chen-Wei Ke, Wei-Ning Chiu, Yu-Xuan Su, Chun-Chun Yang, Chieh-Yuan Cheng, Yun-Nung Chen, Pu-Jen Cheng

TL;DR

The paper tackles how to categorize and fairly compare LLM-based recommender systems by dividing them into Pure and Augmented LLM Recommenders. It proposes a two-branch taxonomy and a unified benchmarking pipeline on the Amazon_23 dataset to isolate design choices such as identifier usage, grounding, and collaborative signals. Key findings show Augmented approaches, especially those using Semantic Identifiers and collaborative modalities, generally outperform Pure LLMs and traditional baselines in sequential recommendation tasks. The study also highlights challenges like distribution gaps between language semantics and recommendation signals, echo chamber effects, and position bias, and outlines future directions on cold-start and cross-domain generalization.

Abstract

Large language models (LLMs) have introduced new paradigms for recommender systems by enabling richer semantic understanding and incorporating implicit world knowledge. In this study, we propose a systematic taxonomy that classifies existing approaches into two categories: (1) Pure LLM Recommenders, which rely solely on LLMs, and (2) Augmented LLM Recommenders, which integrate additional non-LLM techniques to enhance performance. This taxonomy provides a novel lens through which to examine the evolving landscape of LLM-based recommendation. To support fair comparison, we introduce a unified evaluation platform that benchmarks representative models under consistent experimental settings, highlighting key design choices that impact effectiveness. We conclude by discussing open challenges and outlining promising directions for future research. This work offers both a comprehensive overview and practical guidance for advancing next-generation LLM-powered recommenders.

Paper Structure

This paper contains 37 sections, 3 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: An illustration of the taxonomy. LLM Recommenders can be categorized into Pure (top) and Augmented (bottom) LLM Recommenders, depending on whether they use non-LLM techniques to support the LLM's final decision making.
  • Figure 2: Illustration of the challenges in LLM Recommenders: Distribution Gap between Recommendation and Language Semantics, Echo Chamber Effects, and Position Bias.