Sometimes the Model doth Preach: Quantifying Religious Bias in Open LLMs through Demographic Analysis in Asian Nations

Hari Shankar, Vedanta S P, Tejas Cavale, Ponnurangam Kumaraguru, Abhijnan Chakraborty

TL;DR

The paper tackles religious bias in open LLMs within non-Western contexts by proposing a demographic-profiling framework that maps model responses to survey-derived demographic vectors using One-Hot encoding and the Hamming distance $d_H$. It evaluates multiple open LLMs (e.g., Llama and Mistral) against Pew-style surveys from India and East/Southeast Asia, revealing that models often converge to a single homogeneous demographic profile, which raises concerns about hegemonic biases and minority representation. It further examines zero-shot steering prompts, finding limited effectiveness in altering the model's demographic alignment, and discusses data biases, alignment challenges, and potential mitigation strategies such as data augmentation or machine unlearning. The study provides an operational methodology for auditing LLMs’ social biases in diverse global contexts, with implications for safety, fairness, and future research directions in steering and data curation.

Abstract

Large Language Models (LLMs) can generate opinions and unknowingly propagate bias that originates from unrepresentative, non-diverse data collection. Prior research has analyzed these opinions with respect to the West, particularly the United States. However, the insights thus produced may not generalize to non-Western populations. With LLM systems now widely used by people from many walks of life, the cultural sensitivity of each generated output is of crucial interest. Our work proposes a novel method that quantitatively analyzes the opinions generated by LLMs, improving on previous work with regard to extracting the social demographics of the models. Our method measures the Hamming distance between an LLM's responses and those of survey respondents to infer the demographic characteristics reflected in the model's outputs. We evaluate modern, open LLMs such as Llama and Mistral on surveys conducted in various Global South countries, with a focus on India and other Asian nations, specifically assessing the models' performance on surveys related to religious tolerance and identity. Our analysis reveals that most open LLMs match a single homogeneous profile, which varies across countries/territories; this in turn raises questions about the risks of LLMs promoting a hegemonic worldview and undermining the perspectives of different minorities. Our framework may also be useful for future research investigating the complex intersection between training data, model architecture, and the resulting biases reflected in LLM outputs, particularly concerning sensitive topics like religious tolerance and identity.
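
The distance-based profiling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy answers, the placeholder demographic labels, and the choice to summarize the nearest respondents by their most common demographic tuple are all assumptions made for illustration. It one-hot encodes each set of answers, computes the Hamming distance between the model's vector and every respondent's vector, and reads a profile off the closest respondents (the Figure 5 caption mentions the top 1,000; the toy example uses k=2).

```python
from collections import Counter

import numpy as np


def one_hot(answers, n_options):
    """Encode answers (one option index per question) as a flat 0/1 vector."""
    vec = np.zeros(sum(n_options), dtype=np.uint8)
    offset = 0
    for answer, size in zip(answers, n_options):
        vec[offset + answer] = 1  # mark the chosen option within this question's slot
        offset += size
    return vec


def hamming(u, v):
    """Number of positions at which two equal-length binary vectors differ."""
    return int(np.count_nonzero(u != v))


def closest_profile(model_answers, survey_answers, demographics, n_options, k):
    """Most common demographic tuple among the k respondents nearest to the model.

    Summarizing by the modal tuple is an assumption of this sketch.
    """
    model_vec = one_hot(model_answers, n_options)
    dists = np.array(
        [hamming(model_vec, one_hot(a, n_options)) for a in survey_answers]
    )
    nearest = np.argsort(dists, kind="stable")[:k]  # ties keep survey order
    counts = Counter(demographics[i] for i in nearest)
    return counts.most_common(1)[0][0], dists


# Toy data, invented for illustration: 3 questions with 3, 2 and 4 options.
n_options = [3, 2, 4]
survey_answers = [[0, 1, 2], [0, 1, 3], [2, 0, 0], [0, 0, 2]]
demographics = [("A", "rural"), ("A", "rural"), ("B", "urban"), ("A", "urban")]
model_answers = [0, 1, 2]

profile, dists = closest_profile(
    model_answers, survey_answers, demographics, n_options, k=2
)
print(profile, dists.tolist())
```

Under one-hot encoding, each question answered differently flips two positions (the option chosen by one party and the option chosen by the other), so the Hamming distance here is twice the number of disagreeing questions.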

Paper Structure

This paper contains 36 sections, 2 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Summarizing Model Profiles. Model Profiles are specific to the surveys used to create them. Colour code for countries: Purple - India. Blue - Countries from Southeast Asia Survey. Red - Countries from East Asia Survey.
  • Figure 2: Sample Survey Questions from the Indian Survey Dataset. The left table presents opinion-based questions focused on family responsibilities and inheritance rights. The right table includes a few demographic questions, such as education level and household income.
  • Figure 3: System Overview for Model Profiling. Distance is computed by representing both Model and Survey Response as a Vector, following which the Hamming Distance metric is calculated between the two vectors.
  • Figure 4: Detailing the different steering prompts used in our experiments.
  • Figure 5: Model Profiles - India. Comparing Model Responses with top 1,000 closest Survey Respondents. LLMs primarily converge on a single demographic profile (e.g., married Hindu males, 35-44, rural Northern India, low income, high school), with minor regional or educational variations in some models.
  • ...and 2 more figures