The News Comment Gap and Algorithmic Agenda Setting in Online Forums

Flora Böwing; Patrick Gildersleve

TL;DR

Journalists prefer positive, timely, complex, direct responses, while readers favour comments similar to article content from elite authors. This "News Comment Gap" parallels the well-documented "News Gap" between the news stories journalists value and those readers prefer.

Abstract

The disparity between news stories valued by journalists and those preferred by readers, known as the "News Gap", is well-documented. However, the difference in expectations regarding news-related user-generated content is less studied. Comment sections, hosted by news websites, are popular venues for reader engagement, yet still subject to editorial decisions. It is thus important to understand journalist versus reader comment preferences and how these are served by various comment ranking algorithms that represent discussions differently. We analyse 1.2 million comments from Austrian newspaper Der Standard to understand the "News Comment Gap" and the effects of different ranking algorithms. We find that journalists prefer positive, timely, complex, direct responses, while readers favour comments similar to article content from elite authors. We introduce the versatile Feature-Oriented Ranking Utility Metric (FORUM) to assess the impact of different ranking algorithms and find dramatic differences in how they prioritise the display of comments by sentiment, topical relevance, lexical diversity, and readability. Journalists can exert substantial influence over the discourse through both curatorial and algorithmic means. Understanding the implications of these choices is vital in fostering engaging and civil discussions while aligning with journalistic objectives, especially given the increasing legal scrutiny and societal importance of online discourse.

Paper Structure (44 sections, 4 equations, 15 figures, 21 tables)

Introduction
Related work
Significance of online debate for democracy
Motivations behind news engagement
Quality of comments
Institutional aspects of user-generated content
Information overload and ranking policies
Data and Methodology
Data
Feature Selection and Models
Features
Regression Analysis
XGBoost
Measuring the comment gap
Evaluating Comment Ranking Algorithms
...and 29 more sections

Figures (15)

Figure 1: Exemplary screenshot of the comment section underneath an article. Four tabs enable users to sort comments in different ways ("Alle postings" = "all postings"/reverse-chronological/newest first, "Älteste" = "oldest first"/chronological, "Plus" = most upvotes, "Minus" = most downvotes). In this case, one pinned comment is shown at the very top, followed by the input box and the first comment of the general sorting. Upvotes and downvotes are shown in the top-right corner of each comment in green and red, respectively. A user can reply to a comment by clicking "Antworten", upvote by clicking "+" and downvote by clicking "-". Usernames and text redacted for privacy reasons.
Figure 2: Cumulative feature score for the first $i$ comments presented to readers when comments are ranked by best possible (descending order), worst possible (ascending order), random, and example ranking policies.
Figure 3: The process of calculating FORUM score over n comments for three example policies. First the cumulative feature scores for the first $i$ comments returned by each policy are calculated (a). The performance above/below random is then normalised to the best/worst possible policies (b). Finally, the normalised scores are averaged to determine the FORUM score for how well the policy ranks up to $n$ comments (c). Policy 1 performs better than random, policy 2 performs close to random, and policy 3 performs worse than random.
Figure 4: Regression coefficients showing journalist and reader preferences for different comment characteristics (equivalent to model coefficients in the case of Editors' Picks, Upvotes, and Downvotes). 95% and 99% confidence intervals indicated (many obscured by point size). Full data in Tables \ref{tab:pinvote-regressions} and \ref{tab:logodds-regression}.
Figure 5: Regression coefficients showing journalist preferences relative to readers for different comment characteristics. 95% and 99% confidence intervals indicated (many obscured by point size). Full data in Table \ref{tab:logodds-regression}.
...and 10 more figures
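
The three steps in the Figure 3 caption (cumulative feature scores, normalisation against the best and worst possible policies, averaging up to $n$ comments) can be illustrated with a minimal Python sketch. This is a reading of the captions only, not the authors' implementation. The function name `forum_score`, the use of `i * mean` as the expected cumulative score of a random ranking, and the choice to score a position as 0 when the best or worst policy coincides with random are all assumptions made for illustration.

```python
from itertools import accumulate
from typing import Optional, Sequence


def forum_score(ranked: Sequence[float], n: Optional[int] = None) -> float:
    """Hypothetical FORUM-style score for one feature under one ranking.

    `ranked` holds the feature value of each comment in the order the
    policy displays them. The result lies in [-1, 1]: positive means
    better than random, negative means worse than random.
    """
    total = len(ranked)
    if total == 0:
        raise ValueError("need at least one comment")
    n = total if n is None else min(n, total)

    # (a) Cumulative feature score of the first i comments per policy.
    policy = list(accumulate(ranked))
    best = list(accumulate(sorted(ranked, reverse=True)))  # descending
    worst = list(accumulate(sorted(ranked)))                # ascending
    mean = sum(ranked) / total

    normalised = []
    for i in range(1, n + 1):
        # Assumption: a random ranking is represented by its expected
        # cumulative score, i times the mean feature value.
        rand = i * mean
        diff = policy[i - 1] - rand
        # (b) Normalise performance above random by the best policy's
        # margin, and performance below random by the worst policy's.
        span = best[i - 1] - rand if diff >= 0 else rand - worst[i - 1]
        # Assumption: where best/worst coincide with random, score 0.
        normalised.append(diff / span if span > 1e-12 else 0.0)

    # (c) Average the normalised scores over the first n positions.
    return sum(normalised) / n
```

Under these assumptions, a ranking that shows the highest-scoring comments first reaches the maximum at every position where any ranking can differ from random, and the reverse ordering reaches the minimum. Truncating at `n` matters: when all comments are included, every policy has the same cumulative total at the final position, so that position contributes 0.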
