People's Perceptions Toward Bias and Related Concepts in Large Language Models: A Systematic Review

Lu Wang, Max Song, Rezvaneh Rezapour, Bum Chul Kwon, Jina Huh-Yoo

TL;DR

This study examines how people perceive bias and related concepts in large language models (LLMs) through a PRISMA-guided systematic review of 231 records from the ACM Digital Library and the ACL Anthology, which yielded 15 papers with human evaluators (plus 5 added via backward snowballing). The review finds that explicit definitions of bias are rare, that biases arise from both human and model sources, and that perceptions vary across tasks, domains, and users. The reviewed studies fall into four application areas and raise a range of concerns, including regulation and overtrust. The work highlights gaps in definitions, measurement, and demographic reporting, and argues for taxonomy-driven, standardized methodologies to study human-LLM interactions. These insights can inform CHI researchers and practitioners designing user-centered, transparent, and safer LLM-enabled systems, and they motivate further empirical work to study systematically how perceptions of bias change as AI capabilities evolve.

Abstract

Large language models (LLMs) have brought breakthroughs in tasks including translation, summarization, information retrieval, and language generation, and have gained growing interest in the CHI community. Meanwhile, the literature shows that researchers hold conflicting views about the efficacy, ethics, and intellectual abilities of LLMs. However, we do not know how people perceive LLMs that are pervasive in everyday tools, specifically regarding their experiences with LLMs around bias, stereotypes, social norms, or safety. In this study, we conducted a systematic review to understand what empirical insights papers have gathered about people's perceptions toward LLMs. From a total of 231 retrieved papers, we reviewed the full text of 15 papers that recruited human evaluators to assess their experiences with LLMs. We report the biases and related concepts investigated by these studies; four broader LLM application areas; the evaluators' perceptions of LLMs' performance, including advantages, biases, and conflicting perceptions; factors influencing these perceptions; and concerns about LLM applications.

Paper Structure (29 sections, 2 figures, 1 table)