Social Bias in Large Language Models For Bangla: An Empirical Study on Gender and Religious Bias
Jayanta Sadhu, Maneesha Rani Saha, Rifat Shahriyar
TL;DR
This study examines social bias in LLM outputs for Bangla, focusing on gender and religion. It uses two probing paradigms (template-based and naturally sourced) and a curated bias benchmark to measure disparities. Four multilingual LLMs (Llama3-8b, GPT-3.5-Turbo, GPT-4o, Claude-3.5-Sonnet) are evaluated under controlled prompting and short-generation settings, with bias reported via a Disparate Impact framework and a Bias Score defined as $\text{Bias Score} = \tanh(\log(C_x(a)/C_y(a)))$. Template-based probing reveals stronger biases and clearer directional patterns (e.g., gender bias toward females or males depending on model and trait), while naturally sourced probing yields more muted biases, suggesting that context and guardrails influence how bias manifests. The work provides a publicly available dataset and code and highlights the need for Bangla-specific debiasing and fine-tuning. It also outlines ethical considerations and limitations, including binary gender/religion framing and reproducibility challenges with closed models, to guide future research toward more inclusive and robust bias measurement in Bangla NLP.
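To make the Bias Score formula concrete, here is a minimal Python sketch. It rests on assumptions not stated in this summary: $C_x(a)$ and $C_y(a)$ are read as the number of times a model associated attribute $a$ with group $x$ and group $y$ respectively, the logarithm is taken as natural, and zero counts are mapped to the limiting values of the function. The function name `bias_score` is hypothetical and is not taken from the authors' code.

```python
import math


def bias_score(count_x: float, count_y: float) -> float:
    """Return tanh(log(count_x / count_y)), a score in [-1, 1].

    Assumption: count_x and count_y are how often a model picked
    group x versus group y for a given attribute a.
    """
    if count_x < 0 or count_y < 0:
        raise ValueError("counts must be non-negative")
    # Assumed zero handling: use the limits of tanh(log(r)).
    if count_x == 0 and count_y == 0:
        return 0.0   # no evidence either way
    if count_y == 0:
        return 1.0   # ratio tends to infinity
    if count_x == 0:
        return -1.0  # ratio tends to zero
    # Natural log is an assumption; the summary does not give the base.
    return math.tanh(math.log(count_x / count_y))


# Hypothetical counts, for illustration only (not results from the study).
print(bias_score(30, 10))  # positive: attribute leans toward group x
print(bias_score(10, 30))  # negative: attribute leans toward group y
print(bias_score(20, 20))  # zero: no measured disparity
```

Under these assumptions the score is antisymmetric in the two groups: swapping $x$ and $y$ flips the sign, equal counts give 0, and the score approaches $\pm 1$ as one group dominates. With a natural logarithm, $\tanh(\log r) = (r^2 - 1)/(r^2 + 1)$, so a 3:1 ratio yields 0.8.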
Abstract
The rapid growth of Large Language Models (LLMs) has made the study of their biases a crucial field. It is important to assess the influence of the different types of bias embedded in LLMs to ensure fair use in sensitive fields. Although bias assessment has been studied extensively for English, such efforts remain scarce for a major language like Bangla. In this work, we examine two types of social bias in LLM-generated outputs for the Bangla language. Our main contributions are: (1) a study of two different social biases for Bangla, (2) a curated dataset for benchmarking bias measurement, and (3) an evaluation of two different probing techniques for bias detection in the context of Bangla. To the best of our knowledge, this is the first work to assess bias in LLMs for Bangla. All our code and resources are publicly available to support bias-related research in Bangla NLP.
