The Language of Approval: Identifying the Drivers of Positive Feedback Online
Agam Goyal, Charlotte Lambert, Eshwar Chandrasekharan
TL;DR
The paper investigates which linguistic attributes causally drive positive feedback on Reddit by analyzing $11{,}000{,}000$ posts across $100$ subreddits. Using a selection-on-observables causal framework with risk-stratified matching and fixed effects, it isolates the impact of textual features on three reward signals (score, awards, and gold) while controlling for author reputation, timing, and community context. It then shows that the same features are predictive enough to surface desirable posts in real time, with local subreddit models outperforming a global model in many cases (mean AUC $0.726$ for local models versus $0.654$ for the global model). An audit against surveys and guidelines reveals a policy-practice gap: guidelines focus on civility and formatting, not on empirically supported linguistic strategies that boost positive reception. The work contributes a rich, causally informed feature set, actionable guidance for formation-oriented guidelines, and a framework for proactive moderation that emphasizes positive reinforcement over purely punitive strategies.
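To make the AUC comparison above concrete, the following is a minimal sketch of how a per-subreddit mean AUC could be computed from model scores. It is not the authors' evaluation code. The function names, the `(group, label, score)` row layout, and the choice of an unweighted mean over subreddits are all assumptions made for illustration; the paper's aggregation may differ.

```python
from collections import defaultdict


def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative one.
    Ties count as half a win. Labels are 1 (positive) or 0 (negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc_by_group(rows):
    """rows: iterable of (group, label, score) tuples, where `group` is a
    hypothetical subreddit identifier. Returns the unweighted mean of the
    per-group AUCs together with the per-group values (an assumed
    aggregation, not necessarily the one used in the paper)."""
    by_group = defaultdict(lambda: ([], []))
    for group, label, score in rows:
        by_group[group][0].append(label)
        by_group[group][1].append(score)
    per_group = {g: auc(ys, ss) for g, (ys, ss) in by_group.items()}
    return sum(per_group.values()) / len(per_group), per_group
```

Under these assumptions, a local-versus-global comparison would call `mean_auc_by_group` twice on the same held-out posts: once with scores from each subreddit's own model and once with scores from a single global model. The quadratic pairwise loop keeps the definition explicit; a sort-based implementation would be preferable at the scale of millions of posts.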
Abstract
Positive feedback via likes and awards is central to online governance, yet which attributes of users' posts elicit rewards -- and how these vary across authors and communities -- remains unclear. To examine this, we combine quasi-experimental causal inference with predictive modeling on 11M posts from 100 subreddits. We identify linguistic patterns and stylistic attributes causally linked to rewards, controlling for author reputation, timing, and community context. For example, overtly complicated language, tentative style, and toxicity reduce rewards. We use our set of curated features to train models that can detect highly-upvoted posts with high AUC. Our audit of community guidelines highlights a ``policy-practice gap'' -- most rules focus primarily on civility and formatting requirements, with little emphasis on the attributes identified to drive positive feedback. These results inform the design of community guidelines, support interfaces that teach users how to craft desirable contributions, and moderation workflows that emphasize positive reinforcement over purely punitive enforcement.
