Improving RAG for Personalization with Author Features and Contrastive Examples
Mert Yazan, Suzan Verberne, Frederik Situmeang
TL;DR
Personalization in retrieval-augmented generation (RAG) often misses fine-grained author traits. The paper proposes enriching the LLM prompt with author features and Contrastive Examples (CE) to emphasize what makes an author's style unique, achieving up to a 15% relative improvement over baseline RAG. The approach uses the LaMP datasets with Contriever as the retriever and shows that combining CE with author features yields strong gains, particularly for LaMP-7, without adding computational overhead. This work introduces a new paradigm for RAG in which contrastive context complements retrieved samples, enabling more precise, author-aware generation and opening avenues for further IR research on CE retrieval.
Abstract
Personalization with retrieval-augmented generation (RAG) often fails to capture fine-grained features of authors, making it hard to identify their unique traits. To enrich the RAG context, we propose providing Large Language Models (LLMs) with author-specific features, such as average sentiment polarity and frequently used words, in addition to past samples from the author's profile. We introduce a new feature called Contrastive Examples: documents from other authors are retrieved to help the LLM identify what makes an author's style unique in comparison to others. Our experiments show that adding a couple of sentences about the named entities, dependency patterns, and words a person uses frequently significantly improves personalized text generation. Combining features with contrastive examples boosts the performance further, achieving a relative 15% improvement over baseline RAG while outperforming the benchmarks. Our results show the value of fine-grained features for better personalization, while opening a new research dimension for including contrastive examples as a complement to RAG. We release our code publicly.
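The prompt enrichment described in the abstract can be illustrated with a minimal sketch. It is not the authors' released code. The helper names (`frequent_words`, `top_k`, `build_prompt`), the tiny stop-word list, the prompt wording, and the use of bag-of-words cosine similarity in place of a dense retriever such as Contriever are all assumptions made for illustration. The choice to retrieve contrastive examples by similarity to the query is also an assumption. Only one author feature (frequently used words) is shown.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is", "it", "in", "this", "was", "i"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def frequent_words(docs, k=3):
    """Return the k most frequent non-stop-words across an author's documents."""
    counts = Counter(w for d in docs for w in tokenize(d) if w not in STOP_WORDS)
    # Sort by count (descending), then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def cosine(a, b):
    """Bag-of-words cosine similarity; an assumed stand-in for a dense retriever."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def top_k(query, docs, k=1):
    """Return the k documents most similar to the query."""
    return sorted(docs, key=lambda d: cosine(query, d), reverse=True)[:k]


def build_prompt(query, author_docs, other_docs, k=1):
    """Assemble a prompt with author features, past samples, and contrastive examples."""
    features = ", ".join(frequent_words(author_docs))
    own = "\n".join(f"- {d}" for d in top_k(query, author_docs, k))
    # Contrastive examples: retrieved from OTHER authors (assumed: ranked by query similarity).
    contrast = "\n".join(f"- {d}" for d in top_k(query, other_docs, k))
    return (
        f"Words this author uses frequently: {features}\n"
        f"Past samples from this author:\n{own}\n"
        f"Samples from other authors (for contrast):\n{contrast}\n"
        f"Task: {query}"
    )


# Usage with made-up documents.
author_docs = ["Great movie, great cast.", "The plot was great but slow."]
other_docs = ["This movie is dull.", "Stock prices fell today."]
print(build_prompt("Write a movie review title", author_docs, other_docs))
```

Keeping the author's own samples and the contrastive samples in separately labeled sections is one way to let the model compare the two styles. The actual prompt templates and feature set used in the paper may differ.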
