ValuesRAG: Enhancing Cultural Alignment Through Retrieval-Augmented Contextual Learning
Wonduk Seo, Zonghao Yuan, Yi Bu
TL;DR
ValuesRAG tackles cultural values alignment in LLMs by dynamically retrieving and integrating demographic-aware value summaries from the World Values Survey using a retrieval-augmented generation pipeline. It combines in-context learning with a two-stage retrieval and reranking process to produce contextually rich, culturally aligned responses, outperforming zero-shot, role-assignment, few-shot, and hybrid baselines across six regional datasets. An ablation study identifies an optimal retrieval depth (k=3) and demonstrates robustness of a values-only generation variant, highlighting the method's scalability and fairness in diverse cultural settings. The approach offers practical potential for policy analysis, public-facing AI, and NGO applications by bridging global LLM capabilities with localized cultural values while acknowledging ethical considerations around demographic profiling.
Abstract
Ensuring cultural values alignment in Large Language Models (LLMs) remains a critical challenge, as these models often embed Western-centric biases from their training data, leading to misrepresentations and fairness concerns in cross-cultural applications. Existing approaches such as role assignment and few-shot learning struggle to address these limitations because they rely on pre-trained knowledge, scale poorly, and fail to capture nuanced cultural values. To address these issues, we propose ValuesRAG, a framework that applies Retrieval-Augmented Generation (RAG) with In-Context Learning (ICL) to integrate cultural and demographic knowledge dynamically during text generation. Leveraging the World Values Survey (WVS) dataset, ValuesRAG first generates a summary of values for each individual. We then curate several representative regional datasets to serve as test sets and, for each test case, retrieve relevant value summaries based on demographic features, followed by a reranking step that selects the top-k most relevant summaries. We evaluate ValuesRAG on six diverse regional datasets and show that it consistently outperforms baselines, including zero-shot, role-assignment, few-shot, and hybrid methods, in both the main experiments and the ablation settings. ValuesRAG achieves the best overall performance among the compared methods, demonstrating its effectiveness in fostering culturally aligned and inclusive AI systems. Our findings underscore the potential of dynamic retrieval-based methods to bridge the gap between global LLM capabilities and localized cultural values.
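The retrieve-then-rerank flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the record fields (`demographics`, `summary`), the bag-of-words cosine retriever, the Jaccard-overlap reranker, the prompt wording, and the toy records are all assumptions made for illustration; a real system would presumably use learned embedding and reranking models over actual WVS-derived summaries. Only the overall shape (retrieve by demographic similarity, rerank, keep the top-k summaries, place them in the prompt) and the default k=3 come from the text.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_demographics, corpus, n_candidates=4):
    """Stage 1 (toy stand-in): rank records by demographic-text similarity."""
    q = Counter(tokenize(query_demographics))
    scored = [
        (cosine(q, Counter(tokenize(rec["demographics"]))), i)
        for i, rec in enumerate(corpus)
    ]
    # Highest score first; ties are broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [corpus[i] for _, i in scored[:n_candidates]]


def rerank(question, candidates, k=3):
    """Stage 2 (toy stand-in): keep the k summaries closest to the question."""
    q = set(tokenize(question))

    def jaccard(rec):
        s = set(tokenize(rec["summary"]))
        union = q | s
        return len(q & s) / len(union) if union else 0.0

    return sorted(candidates, key=jaccard, reverse=True)[:k]


def build_prompt(question, summaries):
    """Place the selected value summaries in context ahead of the question."""
    context = "\n".join(f"- {rec['summary']}" for rec in summaries)
    return (
        "Value summaries of demographically similar respondents:\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer as a person holding these values would:"
    )


# Invented toy records, not real WVS data.
corpus = [
    {"demographics": "female 34 urban university Germany",
     "summary": "Places high importance on gender equality and trusts public institutions."},
    {"demographics": "male 61 rural primary India",
     "summary": "Considers religion and family very important in life."},
    {"demographics": "female 29 urban university India",
     "summary": "Values family strongly and considers work important in life."},
    {"demographics": "male 45 urban secondary India",
     "summary": "Sees religion as important and respects tradition."},
    {"demographics": "female 31 rural secondary India",
     "summary": "Considers family important and is cautious about trusting strangers."},
]

question = "How important is family in your life?"
candidates = retrieve("female 30 urban university India", corpus)
top = rerank(question, candidates, k=3)
prompt = build_prompt(question, top)
print(prompt)
```

Separating the two stages keeps the first pass cheap (a coarse demographic match over the whole corpus) and reserves the more targeted comparison against the question for a short candidate list, which is the usual motivation for retrieve-then-rerank designs.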
