The Oracle and The Prism: A Decoupled and Efficient Framework for Generative Recommendation Explanation
Jiaheng Zhang, Daqiang Zhang
TL;DR
The paper addresses the trade-off between recommendation accuracy and explanation quality by introducing Prism, a fully decoupled framework that uses a faithfulness-constrained knowledge-distillation pipeline to train a compact explanation generator from a large teacher LLM. By separating ranking from explanation generation, Prism achieves strong human-perceived faithfulness and personalization while delivering substantial efficiency gains (over 24x faster, ~10x less memory) and plug-and-play compatibility with any ranking module. It demonstrates that a compact student can surpass its large teacher on human judgments and even mitigate teacher hallucinations, revealing a knowledge refinement effect from distillation. The work highlights the limitations of automatic metrics like ROUGE-L for this task and provides a practical, trustworthy pathway for deploying explainable recommendations in real-world systems.
Abstract
The integration of Large Language Models (LLMs) into explainable recommendation systems often leads to a performance-efficiency trade-off in end-to-end architectures, where joint optimization of ranking and explanation can result in suboptimal compromises. To resolve this, we propose Prism, a novel decoupled framework that rigorously separates the recommendation process into a dedicated ranking stage and an explanation generation stage. This decomposition ensures that each component is optimized for its specific objective, eliminating inherent conflicts in coupled models. Inspired by knowledge distillation, Prism leverages a powerful, instruction-following teacher LLM (FLAN-T5-XXL) as an Oracle to produce high-fidelity explanatory knowledge. A compact, fine-tuned student model (BART-Base), the Prism, then specializes in synthesizing this knowledge into personalized explanations. Our extensive experiments on benchmark datasets reveal a key finding: the distillation process not only transfers knowledge but also acts as a noise filter. Our 140M-parameter Prism model significantly outperforms its 11B-parameter teacher in human evaluations of faithfulness and personalization, demonstrating an emergent ability to correct hallucinations present in the teacher's outputs. Prism also achieves a 24x speedup and a 10x reduction in memory consumption. Our analysis validates that decoupling, combined with targeted distillation, provides an efficient and effective pathway to high-quality and, perhaps more importantly, trustworthy explainable recommendation.
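The decoupling described above can be illustrated with a minimal sketch. It assumes two narrow interfaces, one for ranking and one for explanation, so that either side can be replaced without touching the other. All names here (`Ranker`, `Explainer`, `recommend_with_explanations`, `PopularityRanker`, `TemplateExplainer`) are hypothetical and are not taken from the paper; the toy explainer merely stands in for the fine-tuned BART-Base student.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, Sequence


class Ranker(Protocol):
    """Hypothetical interface for the ranking stage."""

    def rank(self, user_id: str, candidates: Sequence[str], k: int) -> list[str]: ...


class Explainer(Protocol):
    """Hypothetical interface for the explanation stage."""

    def explain(self, user_id: str, item_id: str) -> str: ...


@dataclass
class Recommendation:
    item_id: str
    explanation: str


def recommend_with_explanations(
    ranker: Ranker,
    explainer: Explainer,
    user_id: str,
    candidates: Sequence[str],
    k: int = 3,
) -> list[Recommendation]:
    # Stage 1: ranking only. The ranker knows nothing about explanations.
    top_items = ranker.rank(user_id, candidates, k)
    # Stage 2: explanation only. The explainer sees the chosen items,
    # not the ranker's internals, so any ranker can be plugged in.
    return [Recommendation(i, explainer.explain(user_id, i)) for i in top_items]


class PopularityRanker:
    """Toy ranker (assumption): orders candidates by a popularity count."""

    def __init__(self, popularity: dict[str, int]) -> None:
        self.popularity = popularity

    def rank(self, user_id: str, candidates: Sequence[str], k: int) -> list[str]:
        ordered = sorted(candidates, key=lambda i: self.popularity.get(i, 0), reverse=True)
        return ordered[:k]


class TemplateExplainer:
    """Toy explainer (assumption): a stand-in for a distilled student model."""

    def explain(self, user_id: str, item_id: str) -> str:
        return f"Recommended {item_id} for {user_id}."


if __name__ == "__main__":
    recs = recommend_with_explanations(
        PopularityRanker({"a": 1, "b": 5, "c": 3}),
        TemplateExplainer(),
        user_id="u1",
        candidates=["a", "b", "c"],
        k=2,
    )
    for r in recs:
        print(r.item_id, "->", r.explanation)
```

Because the two stages meet only at the list of selected item IDs, replacing `PopularityRanker` with another ranking module requires no change to the explainer, which is the plug-and-play property the framework claims.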
