Sociodemographic Bias in Language Models: A Survey and Forward Path
Vipul Gupta, Pranav Narayanan Venkit, Shomir Wilson, Rebecca J. Passonneau
TL;DR
This survey comprehensively maps a decade of sociodemographic bias research in NLP by framing the field with a three-pronged typology: (i) types of bias, (ii) methods to quantify bias, and (iii) debiasing strategies applied at finetuning, training, and inference. It evaluates intrinsic and extrinsic measurement approaches, highlights foundational benchmarks and datasets such as WEAT, CrowS-Pairs, and HolisticBias, and discusses the limitations and reliability of current metrics. The authors propose a 13-question checklist to guide robust, interdisciplinary, and reproducible bias research, and they identify training-time mitigation and expert-model techniques as promising directions. The work emphasizes practical impact, language diversity, and sociotechnical considerations, calling for broader collaboration beyond NLP to curb harms and improve equitable deployment of language technologies.
Abstract
Sociodemographic bias in language models (LMs) has the potential to cause harm when the models are deployed in real-world settings. This paper presents a comprehensive survey of the past decade of research on sociodemographic bias in LMs, organized into a typology that facilitates examining the different aims: types of bias, quantifying bias, and debiasing techniques. We track the evolution of the latter two aims, then identify current trends and their limitations, as well as emerging techniques. To guide future research towards more effective and reliable solutions, and to help authors situate their work within this broad landscape, we conclude with a checklist of open questions.
