Mind the Gap: Pitfalls of LLM Alignment with Asian Public Opinion

Hari Shankar, Vedanta S P, Sriharini Margapuri, Debjani Mazumder, Ponnurangam Kumaraguru, Abhijnan Chakraborty

Abstract

Large Language Models (LLMs) are increasingly being deployed in multilingual, multicultural settings, yet their reliance on predominantly English-centric training data risks misalignment with the diverse cultural values of different societies. In this paper, we present a comprehensive, multilingual audit of the cultural alignment of contemporary LLMs, including GPT-4o-Mini, Gemini-2.5-Flash, Llama 3.2, Mistral, and Gemma 3, across India, East Asia, and Southeast Asia. Our study focuses on the sensitive domain of religion as a prism for broader alignment. To this end, we conduct a multi-faceted analysis of each LLM's internal representations, using log-probabilities and logits, and compare the resulting opinion distributions against ground-truth public attitudes. We find that while the popular models generally align with public opinion on broad social issues, they consistently fail to accurately represent religious viewpoints, especially those of minority groups, often amplifying negative stereotypes. Lightweight interventions, such as demographic priming and native-language prompting, partially mitigate but do not eliminate these cultural gaps. We further show that downstream evaluations on bias benchmarks (such as CrowS-Pairs, IndiBias, ThaiCLI, and KoBBQ) reveal persistent harms and under-representation in sensitive contexts. Our findings underscore the urgent need for systematic, regionally grounded audits to ensure equitable global deployment of LLMs.

Paper Structure (27 sections, 3 equations, 9 figures, 3 tables)

Figures (9)

  • Figure 1: Evaluation framework for assessing LLMs, where human opinion distributions from Pew surveys (India, Sri Lanka, East Asia, and Southeast Asia) are compared with model-generated distributions to measure representativeness across various categories.
  • Figure 2: The treemap shows respondent counts from 12 countries/territories across India (green), East Asia (blue), and Southeast Asia (orange), with block area proportional to sample size. These nationally representative surveys (Pew Research Center 2021, 2024a, 2024b) form the empirical ground truth for measuring LLM representativeness, enabling robust cross-country comparisons on religion and social attitudes.
  • Figure 3: Qualitative examples for two religion-related bias benchmarks. CrowS-Pairs operationalizes bias via minimal pairs scored with pseudo-log-likelihood, while ThaiCLI uses instruction-preference judgments with explicit chosen vs. rejected responses.
  • Figure 4: Representativeness scores ($\mathcal{R}_\mathcal{M}$) of GPT-4o-Mini and Gemini-2.5-Flash on non-religious versus religious items. While both models achieve high representativeness on non-religious prompts ($>94\%$), their scores dip on religious items.
  • Figure 5: Change in Hellinger distance ($\Delta H = H_{\text{local}} - H_{\text{EN}}$) when switching from English to local-language prompts for Gemma-3-12B across multiple locales. Negative values (bars below zero) indicate that local-language prompting reduces the divergence between model and human distributions.
  • ...and 4 more figures
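
The quantity in the Figure 5 caption, $\Delta H = H_{\text{local}} - H_{\text{EN}}$, can be illustrated with a minimal Python sketch. It assumes the standard definition of the Hellinger distance between two discrete distributions, $H(P, Q) = \frac{1}{\sqrt{2}} \lVert \sqrt{P} - \sqrt{Q} \rVert_2$, and assumes that a model's opinion distribution is obtained by normalizing its log-probabilities over the answer options of a survey item. The function names (`normalize_logprobs`, `hellinger`, `delta_h`) and all numbers are hypothetical illustrations, not the authors' code or data.

```python
import math


def normalize_logprobs(logprobs):
    """Softmax over per-option log-probabilities (assumed way to get a distribution)."""
    top = max(logprobs)  # subtract the max for numerical stability
    weights = [math.exp(lp - top) for lp in logprobs]
    total = sum(weights)
    return [w / total for w in weights]


def hellinger(p, q):
    """Standard Hellinger distance between two discrete distributions, in [0, 1]."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same answer options")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def delta_h(human, model_local, model_en):
    """Delta H = H_local - H_EN; negative means the local-language prompt is closer."""
    return hellinger(human, model_local) - hellinger(human, model_en)


# Made-up numbers for one survey item with three answer options.
human = [0.6, 0.3, 0.1]
model_en = normalize_logprobs([-2.0, -0.5, -1.5])     # English prompt
model_local = normalize_logprobs([-0.6, -1.2, -2.4])  # local-language prompt
print(delta_h(human, model_local, model_en))
```

Under these assumptions, a negative printed value corresponds to a bar below zero in Figure 5: the local-language prompt moved the model's distribution closer to the human one.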