Large Language Models as Conversational Movie Recommenders: A User Study

Ruixuan Sun; Xinyi Li; Avinash Akella; Joseph A. Konstan

TL;DR

This paper investigates open-source Large Language Models as conversational movie recommenders through an online field study with 160 active users. It compares zero-shot, one-shot, and few-shot prompts across three scenarios to assess perceived recommendation quality, finding that LLMs excel at explainability and interaction but struggle with personalization, diversity, and trust, though they show a greater ability to surface niche recommendations. By analyzing both quantitative survey data and qualitative conversation patterns, the study shows that providing personal context and examples enhances recommendation quality, that longer dialogues can reduce satisfaction, and that prompting technique alone does not substantially change outcomes. The work suggests design directions such as retrieval-augmented generation, grounding, multimodal data, and proactive user guidance to improve LLM-based recommender experiences, and it outlines actionable learnings for researchers and practitioners.

Abstract

This paper explores the effectiveness of using large language models (LLMs) for personalized movie recommendations from users' perspectives in an online field experiment. Our study involves a combination of between-subject prompt and historic consumption assessments, along with within-subject recommendation scenario evaluations. By examining conversation and survey response data from 160 active users, we find that LLMs offer strong recommendation explainability but lack overall personalization, diversity, and user trust. Our results also indicate that different personalized prompting techniques do not significantly affect user-perceived recommendation quality, but the number of movies a user has watched plays a more significant role. Furthermore, LLMs show a greater ability to recommend lesser-known or niche movies. Through qualitative analysis, we identify key conversational patterns linked to positive and negative user interaction experiences and conclude that providing personal context and examples is crucial for obtaining high-quality recommendations from LLMs.

Paper Structure (22 sections, 8 figures, 7 tables)

Introduction
Related Work
Large Language Models in Recommendation
RecSys User Experience
Study Design
Participants and Personalized Prompts
User Interface and Scenarios
Evaluation Metrics
Results
User-assessed LLMRec Quality
Defect of Recommendation Quality
Interactivity, Explainability, and Context
Control, Trust, and Transparency
Effect of Prompt, Scenario, and Rating
Useful Conversation Strategies
...and 7 more sections

Figures (8)

Figure 1: Overview of the LLM recommender user study design.
Figure 2: Diagram depicting the general rule and the randomly selected prompting technique the LLM uses to engage with users in the first phase of the user study.
Figure 3: Three recommendation scenarios.
Figure 4: Survey questions used as evaluation metrics.
Figure 5: User perception of recommendation quality from LLMRec compared to the classic MovieRec experience. All LLMRec-MovieRec pairs were tested with a paired t-test and show statistically significant differences (p <= 0.05), except for Birthday-Explainability and Long-Trip-Explainability.
...and 3 more figures
