Religious Bias Landscape in Language and Text-to-Image Models: Analysis, Detection, and Debiasing Strategies

Ajwad Abrar; Nafisa Tabassum Oeshy; Mohsinul Kabir; Sophia Ananiadou

TL;DR

The paper probes religious bias in language and text-to-image (T2I) models, highlighting persistent negative associations (notably with Islam) across masked prediction, prompt completion, and image generation. It introduces a cross-domain evaluation framework that uses 100 prompts per model for mask filling and prompt completion and 50 biased images per adjective for T2I tasks, together with a Religious Bias Score (RBS) to quantify bias. Two debiasing prompt strategies (positive term augmentation and bias mitigation instructions) substantially reduce bias without eliminating it across the board, although some open- and closed-source models show near-zero bias after intervention. The study also uncovers cross-domain biases linking religion to nationality, gender, and age, which points to the need for more robust training data and systemic debiasing approaches beyond prompt engineering. Overall, the work provides a public artifact of prompts and images to advance fairer, globally acceptable AI systems.

Abstract

Note: This paper includes examples of potentially offensive content related to religious bias, presented solely for academic purposes.

The widespread adoption of language models highlights the need for critical examinations of their inherent biases, particularly concerning religion. This study systematically investigates religious bias in both language models and text-to-image generation models, analyzing both open-source and closed-source systems. We construct approximately 400 unique, naturally occurring prompts to probe language models for religious bias across diverse tasks, including mask filling, prompt completion, and image generation. Our experiments reveal concerning instances of underlying stereotypes and biases associated disproportionately with certain religions. Additionally, we explore cross-domain biases, examining how religious bias intersects with demographic factors such as gender, age, and nationality. This study further evaluates the effectiveness of targeted debiasing techniques by employing corrective prompts designed to mitigate the identified biases. Our findings demonstrate that language models continue to exhibit significant biases in both text and image generation tasks, emphasizing the urgent need to develop fairer language models to achieve global acceptability.

Paper Structure (35 sections, 2 equations, 33 figures, 10 tables)

Introduction
Related Work
Models
Pre-trained Language Models
Open-source Large Language Models
Closed-source Large Language Models
Text-to-Image Generation
Methodology
Bias Detection
Mask Filling
Prompt Completion
Text-to-Image Generation
Debiasing Techniques
Positive Term Augmentation
Bias Mitigation Instructions
...and 20 more sections

Figures (33)

Figure 1: An example from RoBERTa showing anti-Muslim bias.
Figure 2: An overview of the bias evaluation framework for large language models, illustrating two probing methods: Mask Filling and Prompt Completion. The framework demonstrates how models respond to queries combining negative actions and religion names, highlighting biased and unbiased outputs for bias measurement.
Figure 3: Proposed Methodology
Figure 4: Examples of percentage distributions with the sentence “The most violent religion in the world is [mask].” for various pre-trained language models.
Figure 5: An image generated by DALL-E 3 in response to the prompt "Religious terrorist".
...and 28 more figures
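
The percentage distributions described in the Figure 4 caption can be illustrated with a small tally over mask-fill outputs. The following is a minimal sketch, not the authors' code: the function name `fill_distribution` and the placeholder fills are hypothetical, no model is called, and the sample list is invented purely to show the arithmetic. It does not compute the paper's RBS, whose definition is not given on this page.

```python
from collections import Counter


def fill_distribution(fills):
    """Return {fill: percentage} for a list of mask-fill strings.

    Fills are lowercased and stripped so that trivially different
    spellings of the same token are counted together.
    """
    normalized = [f.strip().lower() for f in fills if f.strip()]
    if not normalized:
        return {}
    counts = Counter(normalized)
    total = len(normalized)
    # most_common() orders entries from most to least frequent
    return {fill: 100.0 * n / total for fill, n in counts.most_common()}


# Hypothetical placeholder outputs (NOT data from the paper)
sample_fills = ["religion_a", "Religion_A ", "religion_b", "religion_c"]
distribution = fill_distribution(sample_fills)

for fill, pct in distribution.items():
    print(f"{fill}: {pct:.1f}%")
```

In practice the list of fills would come from running one masked sentence through a model many times or from its top-k predictions; how the paper aggregates these is not specified here, so that step is left out.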
