Along the Margins: Marginalized Communities' Ethical Concerns about Social Platforms
Lauren Olson, Emitzá Guzmán, Florian Kunneman
TL;DR
The paper investigates the ethical concerns that marginalized communities raise about social platforms. The authors build a Reddit-based dataset from 586 subreddits, manually annotate a subset of social platform mentions for ethical concerns, and train NLP classifiers to detect and categorize those concerns automatically. Discrimination and misrepresentation emerge as the dominant themes, with substantial variation across communities and platforms, and the automated methods identify and classify concerns with competitive accuracy. The work highlights the value of disaggregated, data-driven feedback in guiding more equitable software development and platform governance, while acknowledging limitations in scope and generalizability. Overall, the study offers a scalable framework for surfacing and acting on marginalized users' ethical concerns to inform design justice and responsible platform stewardship.
Abstract
In this paper, we identified marginalized communities' ethical concerns about social platforms. We performed this identification because recent platform malfeasance indicates that software teams prioritize shareholder concerns over user concerns. Additionally, these platform shortcomings often have devastating effects on marginalized populations. We first scraped 586 marginalized communities' subreddits, aggregated a dataset of their social platform mentions and manually annotated mentions of ethical concerns in these data. We subsequently analyzed trends in the manually annotated data and tested the extent to which ethical concerns can be automatically classified by means of natural language processing (NLP). We found that marginalized communities' ethical concerns predominantly revolve around discrimination and misrepresentation, and reveal deficiencies in current software development practices. As such, researchers and developers could use our work to further investigate these concerns and rectify current software flaws.
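The aggregation step described above (collecting posts that mention a social platform) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the platform list, the keyword-matching rule, the `(subreddit, text)` input shape, and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical platform list; the paper's actual list is not given here.
PLATFORMS = ["facebook", "instagram", "twitter", "tiktok", "reddit", "youtube"]

# Whole-word, case-insensitive match, so "subreddits" does not count as "reddit".
PLATFORM_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, PLATFORMS)) + r")\b", re.IGNORECASE
)


def platform_mentions(text):
    """Return the set of lowercase platform names mentioned in text."""
    return {m.group(1).lower() for m in PLATFORM_RE.finditer(text)}


def collect_mentions(posts):
    """Keep only posts that mention a platform.

    posts is an iterable of (subreddit, text) pairs (an assumed input shape).
    """
    kept = []
    for subreddit, text in posts:
        found = platform_mentions(text)
        if found:
            kept.append(
                {"subreddit": subreddit, "text": text, "platforms": sorted(found)}
            )
    return kept


def mentions_per_platform(records):
    """Count how many kept posts mention each platform."""
    counts = Counter()
    for record in records:
        counts.update(record["platforms"])
    return counts
```

Records kept by `collect_mentions` would then be the candidates for manual annotation and, later, classifier training; grouping them by `subreddit` before counting is one way to keep the analysis disaggregated by community.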
