Evaluating Social Bias in RAG Systems: When External Context Helps and Reasoning Hurts

Shweta Parihar; Lu Cheng

TL;DR

This work systematically evaluates social bias in Retrieval-Augmented Generation (RAG) across more than 13 bias types using two retrieval corpora (WikiText-103 and C4) and multiple bias benchmarks (SCW, BOLD, HolisticBias). It shows that standard RAG generally reduces bias by diversifying contextual grounding, while introducing Chain-of-Thought (CoT) prompting within RAG paradoxically increases bias, revealing a fairness-accuracy trade-off. A faithfulness analysis indicates that CoT explanations are largely grounded in retrieved evidence, yet more explicit reasoning can amplify biased associations. The findings highlight the need for bias-aware reasoning frameworks and carefully curated external context, so that RAG's fairness benefits are not undone by bias introduced in the reasoning process.

Abstract

Social biases inherent in large language models (LLMs) raise significant fairness concerns. Retrieval-Augmented Generation (RAG) architectures, which retrieve external knowledge sources to enhance the generative capabilities of LLMs, remain susceptible to the same bias-related challenges. This work focuses on evaluating and understanding the social bias implications of RAG. Through extensive experiments across various retrieval corpora, LLMs, and bias evaluation datasets, encompassing more than 13 different bias types, we surprisingly observe a reduction in bias in RAG. This suggests that the inclusion of external context can help counteract stereotype-driven predictions, potentially improving fairness by diversifying the contextual grounding of the model's outputs. To better understand this phenomenon, we then explore the model's reasoning process by integrating Chain-of-Thought (CoT) prompting into RAG while assessing the faithfulness of the model's CoT. Our experiments reveal that the model's bias inclinations shift between stereotype and anti-stereotype responses as more contextual information is incorporated from the retrieved documents. Interestingly, we find that while CoT enhances accuracy, contrary to the bias reduction observed with RAG, it increases overall bias across datasets, highlighting the need for bias-aware reasoning frameworks that can mitigate this trade-off.
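
The before-and-after comparison described in the abstract can be illustrated with a minimal sketch. It assumes each model response has already been labeled as choosing the stereotypical option, the anti-stereotypical option, or neither, and it scores bias as the distance of the stereotype-selection rate from a balanced 0.5. The function names, the labels, and this scoring rule are illustrative assumptions, not the metrics defined in the paper, and the toy data are not results from it.

```python
from collections import Counter


def stereotype_rate(choices):
    """Fraction of decided responses that pick the stereotypical option.

    `choices` is a list of hypothetical labels: "stereo", "anti", or
    "unrelated". Responses labeled "unrelated" are excluded.
    """
    counts = Counter(choices)
    decided = counts["stereo"] + counts["anti"]
    if decided == 0:
        return 0.5  # no evidence either way; treat as balanced
    return counts["stereo"] / decided


def bias_score(choices):
    """Distance from the balanced rate of 0.5, scaled to [0, 1].

    This scoring rule is an assumption made for illustration only.
    """
    return abs(stereotype_rate(choices) - 0.5) * 2


# Hypothetical toy labels, not results from the paper.
before_rag = ["stereo", "stereo", "stereo", "anti", "unrelated"]
after_rag = ["stereo", "anti", "stereo", "anti", "unrelated"]

# A negative delta means the score moved toward balance after retrieval.
delta = bias_score(after_rag) - bias_score(before_rag)
print(f"before={bias_score(before_rag):.2f} "
      f"after={bias_score(after_rag):.2f} delta={delta:+.2f}")
```

The same comparison could be repeated for a CoT-prompted variant to see whether the score moves away from balance again, which is the pattern the abstract reports.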

Paper Structure (19 sections, 3 equations, 9 figures, 6 tables)

This paper contains 19 sections, 3 equations, 9 figures, 6 tables.

Introduction
Related works
Experimental design
Datasets, Metrics and RAG Pipeline Implementation
Bias Evaluation Methodology
Chain of Thought (CoT) Analysis on RAG
Experimental results
RQ 1 - Bias Evaluation in RAG
RQ 2 - Bias after RAG with CoT
RQ 3 - Faithfulness of CoT on RAG Bias
Conclusion
Disclosure of Interests.
Bias evaluation datasets and metrics
RAG pipeline implementation
Retrieval Datasets
...and 4 more sections

Figures (9)

Figure 1: Experimental design for bias evaluation and understanding in RAG. Step 1: Calculate bias before RAG. Step 2: Calculate bias after RAG and compare it with bias before RAG. Step 3: Implement CoT with RAG to understand the model's reasoning, then measure bias again and compare. Step 4: Evaluate the faithfulness of RAG's CoT explanations to determine whether they are faithful to the retrieved context.
Figure 2: Pearson correlations between Bias and Evaluation Metric scores across different prompting variants.
Figure 3: Bias inclination of model Meta-Llama-3-8B-Ins at 4 partial prompt checkpoints: after giving 1 sentence, 25%, 50%, 70% of Full CoT explanations 1-15.
Figure 4: Prompt template before RAG for SCW dataset
Figure 5: Prompt template after standard RAG for SCW dataset
...and 4 more figures
