ExplainRec: Towards Explainable Multi-Modal Zero-Shot Recommendation with Preference Attribution and Large Language Models

Bo Ma, LuYao Liu, ZeHua Hu, Simon Lau

TL;DR

ExplainRec tackles the limitations of prior LLM-based recommender systems by integrating explainable preference attribution, zero-shot transfer, multi-modal fusion, and multi-task optimization within a unified framework. It introduces attribution-enhanced instructions, a multi-objective preference loss, a universal preference knowledge base, and a joint training scheme to share learning signals across tasks. Empirical results on MovieLens-25M, Amazon Movies & TV, and cross-domain data show consistent improvements (0.7%–0.9% AUC gains) over strong baselines, along with interpretable explanations and robust cold-start performance. The approach demonstrates the value of combining explainability, cross-modal content, and cross-task collaboration for practical, scalable LLM-based recommendations with strong transfer capabilities.

Abstract

Recent advances in Large Language Models (LLMs) have opened new possibilities for recommendation systems, though current approaches such as TALLRec face challenges in explainability and cold-start scenarios. We present ExplainRec, a framework that extends LLM-based recommendation capabilities through preference attribution, multi-modal fusion, and zero-shot transfer learning. The framework incorporates four technical contributions: preference attribution tuning for explainable recommendations, zero-shot preference transfer for cold-start users and items, multi-modal enhancement leveraging visual and textual content, and multi-task collaborative optimization. Experimental evaluation on MovieLens-25M and Amazon datasets shows that ExplainRec outperforms existing methods, achieving AUC improvements of 0.7% on movie recommendation and 0.9% on cross-domain tasks, while generating interpretable explanations and handling cold-start scenarios effectively.

Paper Structure

This paper contains 34 sections, 11 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Overview of the ExplainRec framework. The framework extends TALLRec with four key components: (1) Multi-modal enhancement for processing text and visual information, (2) Preference attribution module for generating explanations, (3) Zero-shot transfer component for cold-start scenarios, and (4) Multi-task optimization for joint training. The framework maintains the two-stage training paradigm while adding explainability and multi-modal capabilities.
  • Figure 2: Performance comparison across different methods on MovieLens-25M and Amazon Movies datasets. ExplainRec consistently outperforms all baseline methods on both AUC and NDCG@10. The gains are largest over traditional methods (LightGCN, SASRec), while results remain competitive with recent LLM-based approaches (TALLRec, Chat-REC, P5). Multi-modal methods (MMGCN, LATTICE) show moderate improvements, but ExplainRec's comprehensive approach achieves the best overall performance.
  • Figure 3: Ablation study and cold-start performance analysis. (Left) Component contribution analysis showing that preference attribution provides the largest performance gain while dramatically improving explanation quality. (Right) Cold-start performance comparison demonstrating ExplainRec's superior zero-shot transfer capabilities across new users, new items, and cross-domain scenarios. The universal preference knowledge base enables effective recommendations even without historical interactions.
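
The two metrics named in the Figure 2 caption, AUC and NDCG@10, can be illustrated with a minimal sketch. This is not the paper's evaluation code: the function names `auc` and `ndcg_at_k`, the pairwise AUC formulation, the linear-gain form of DCG, and the toy inputs are all assumptions chosen for illustration.

```python
import math


def auc(labels, scores):
    """Pairwise AUC: the fraction of (positive, negative) pairs in which the
    positive item gets the higher score; ties count as half a win.
    Assumes binary labels (1 = liked, 0 = not liked)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k with linear gain. `ranked_relevances` holds the true relevance
    of each item in the order the model ranked them. The ideal ranking is
    taken over this same list (an assumption of this sketch)."""
    def dcg(rels):
        # Position i (0-based) is discounted by log2(i + 2).
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0


# Toy example (made-up numbers, not results from the paper).
toy_auc = auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
toy_ndcg = ndcg_at_k([0, 1], k=10)
```

In the toy example, three of the four positive/negative pairs are ordered correctly, so `toy_auc` is 0.75. Under this reading, an AUC gain of 0.7% to 0.9% means a correspondingly larger share of such pairs is ranked in the right order.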