Do they mean 'us'? Interpreting Referring Expressions in Intergroup Bias
Venkata S Govindarajan, Matianyu Zang, Kyle Mahowald, David Beaver, Junyi Jessy Li
TL;DR
Do they mean 'us'? Interpreting Referring Expressions in Intergroup Bias develops a data-driven framework for studying intergroup bias in natural language by tagging referring expressions in NFL game-thread comments grounded in live win probabilities (WP). The authors build a parallel corpus of over 6 million comments from 32 NFL team subreddits, ground each comment in the WP at the time of the play, and label references as in-group ([in]), out-group ([out]), or other ([other]), with expert and crowd validation. The study shows that as the in-group's WP increases, referential language increasingly abstracts away from the in-group and shifts toward the out-group or implicit references, following a near-linear pattern across WP windows; GPT-4o benefits from prompts with linguistic descriptions of WP, while a finetuned Llama-3-8B achieves strong performance on standard tagging. The work enables large-scale sociolinguistic analysis, highlights nuanced manifestations of the Linguistic Intergroup Bias in naturalistic sports talk, and provides code and data for replication.
Abstract
The variations between in-group and out-group speech (intergroup bias) are subtle and could underlie many social phenomena like stereotype perpetuation and implicit bias. In this paper, we model the intergroup bias as a tagging task on English sports comments from forums dedicated to fandom for NFL teams. We curate a unique dataset of over 6 million game-time comments from opposing perspectives (the teams in the game), each comment grounded in a non-linguistic description of the events that precipitated these comments (live win probabilities for each team). Expert and crowd annotations justify modeling the bias through tagging of implicit and explicit referring expressions and reveal the rich, contextual understanding of language and the world required for this task. For large-scale analysis of intergroup variation, we use LLMs for automated tagging, and discover that some LLMs perform best when prompted with linguistic descriptions of the win probability at the time of the comment, rather than the numerical probability. Further, large-scale tagging of comments using LLMs uncovers linear variations in the form of referent across win probabilities that distinguish in-group and out-group utterances. Code and data are available at https://github.com/venkatasg/intergroup-nfl.
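To make the setup concrete, here is a minimal sketch of what WP-grounded tagged comments might look like and how tag frequencies could be aggregated across WP windows. The record schema, tag syntax, and window width are illustrative assumptions, not the paper's actual data format.

```python
import re

# Hypothetical record format (the released dataset's schema may differ):
# each comment carries the in-group team's live win probability (WP) at
# posting time, with referring expressions tagged inline as [in]
# (in-group), [out] (out-group), or [other].
comments = [
    {"wp": 0.72, "text": "We [in] just need the defense [in] to hold them [out]"},
    {"wp": 0.18, "text": "The refs [other] are handing this to them [out]"},
]

TAG_RE = re.compile(r"\[(in|out|other)\]")

def tag_counts(text):
    """Count tagged referring expressions by group label."""
    counts = {"in": 0, "out": 0, "other": 0}
    for label in TAG_RE.findall(text):
        counts[label] += 1
    return counts

def counts_by_wp_window(comments, width=0.25):
    """Aggregate tag counts into fixed-width WP windows, mirroring the
    kind of analysis of referential form across win probabilities that
    the paper reports (window width here is an arbitrary choice)."""
    windows = {}
    for c in comments:
        key = round(int(c["wp"] / width) * width, 2)
        agg = windows.setdefault(key, {"in": 0, "out": 0, "other": 0})
        for label, n in tag_counts(c["text"]).items():
            agg[label] += n
    return windows
```

Binning by WP window like this is what would let one check for the near-linear shift in referent form as the in-group's win probability rises.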
