Social Biases in NLP Models as Barriers for Persons with Disabilities
Ben Hutchinson, Vinodkumar Prabhakaran, Emily Denton, Kellie Webster, Yu Zhong, Stephen Denuyl
TL;DR
The paper investigates how NLP systems encode biases about persons with disabilities. It presents evidence of such biases in toxicity and sentiment classifiers and in neural text embeddings, and links them in part to text in which mentions of disability co-occur with stigmatizing topics. The methods are perturbation-based tests of the classifiers, probes of BERT embeddings, and an analysis of topical co-occurrence in data. The findings show that mentions of disability can be unfairly associated with toxicity and negative sentiment, and that topics such as homelessness, gun violence, and drug addiction are over-represented in texts discussing mental illness, which may contribute to the observed model biases. The authors argue for socio-technical mitigation, transparency, and inclusive, stakeholder-driven approaches to reduce harm in real-world deployments.
Abstract
Building equitable and inclusive NLP technologies demands consideration of whether and how social attitudes are represented in ML models. In particular, representations encoded in models often inadvertently perpetuate undesirable social biases from the data on which they are trained. In this paper, we present evidence of such undesirable biases towards mentions of disability in two different English language models: toxicity prediction and sentiment analysis. Next, we demonstrate that the neural embeddings that are the critical first step in most NLP pipelines similarly contain undesirable biases towards mentions of disability. We end by highlighting topical biases in the discourse about disability which may contribute to the observed model biases; for instance, gun violence, homelessness, and drug addiction are over-represented in texts discussing mental illness.
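A perturbation-based test of the kind mentioned in the TL;DR can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `perturbation_score_shifts`, the templates, the phrases, and the baseline phrase are all assumptions made for the example, and `toy_score` is a hypothetical, deliberately biased stand-in for a real toxicity or sentiment model. The idea is to fill the same sentence templates with different phrases and measure how far the model's score moves relative to a neutral baseline phrase.

```python
from statistics import mean
from typing import Callable, Dict, List


def perturbation_score_shifts(
    templates: List[str],
    phrases: List[str],
    score: Callable[[str], float],
    baseline: str = "a person",
) -> Dict[str, float]:
    """Average change in model score when `baseline` is swapped for each phrase.

    Each template must contain exactly one "{}" slot. A shift near zero means
    the model treats the phrase like the baseline; a large shift suggests the
    score depends on the phrase itself rather than on the rest of the sentence.
    """
    shifts = {}
    for phrase in phrases:
        diffs = [
            score(template.format(phrase)) - score(template.format(baseline))
            for template in templates
        ]
        shifts[phrase] = mean(diffs)
    return shifts


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a classifier, biased on purpose.

    It returns 1.0 whenever the word "deaf" appears, regardless of context,
    which is exactly the kind of behavior the test is meant to surface.
    """
    words = text.lower().replace(".", "").split()
    return 1.0 if "deaf" in words else 0.0


if __name__ == "__main__":
    # Illustrative templates and phrases, chosen for this sketch only.
    templates = ["I am {}.", "My friend is {}."]
    phrases = ["a deaf person", "a tall person"]
    for phrase, shift in perturbation_score_shifts(templates, phrases, toy_score).items():
        print(f"{phrase}: {shift:+.2f}")
```

In practice `toy_score` would be replaced by a call to the model under test. Averaging over many templates reduces the influence of any single sentence frame on the measured shift.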
