Reliability Estimation of News Media Sources: Birds of a Feather Flock Together

Sergio Burdisso, Dairazalia Sánchez-Cortés, Esaú Villatoro-Tello, Petr Motlicek

TL;DR

This paper introduces a novel approach to source reliability estimation that uses reinforcement learning strategies to estimate the reliability degree of news sources, based on how all news media sources interact with each other on the Web.

Abstract

Evaluating the reliability of news sources is a routine task for journalists and organizations committed to acquiring and disseminating accurate information. Recent research has shown that predicting sources' reliability is an important first step in addressing further challenges such as fake news detection and fact-checking. In this paper, we introduce a novel approach for source reliability estimation that leverages reinforcement learning strategies for estimating the reliability degree of news sources. Contrary to previous research, our proposed approach models the problem as the estimation of a reliability degree, and not a reliability label, based on how all the news media sources interact with each other on the Web. We validated the effectiveness of our method on a news media reliability dataset that is an order of magnitude larger than comparable existing datasets. Results show that the estimated reliability degrees strongly correlate with journalist-provided scores (Spearman=0.80) and can effectively predict reliability labels (macro-avg. F$_1$ score=81.05). We release our implementation and dataset, aiming to provide a valuable resource for the NLP community working on information verification.
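The two evaluation measures mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy scores, the zero threshold used to turn a degree into a label, and the helper names `ranks`, `spearman`, and `macro_f1` are all assumptions made for illustration. Note that the sketch reports macro-averaged F$_1$ on a 0 to 1 scale, whereas the abstract reports it on a 0 to 100 scale.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (0 to 1 scale)."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return mean(scores)


# Toy data invented for illustration; not taken from the paper's dataset.
estimated_degree = [0.9, 0.7, -0.1, -0.2, -0.6, -0.8]
journalist_score = [95, 80, 85, 40, 30, 10]
true_labels = ["reliable"] * 3 + ["unreliable"] * 3

# Assumption: a degree above 0 is read as "reliable".
THRESHOLD = 0.0
pred_labels = [
    "reliable" if degree > THRESHOLD else "unreliable"
    for degree in estimated_degree
]

rho = spearman(estimated_degree, journalist_score)
f1 = macro_f1(true_labels, pred_labels)
print(f"Spearman: {rho:.3f}  macro-F1: {f1:.3f}")
```

The rank correlation uses the continuous degree directly, while the F$_1$ score needs a decision threshold first, which is why a degree-based formulation can be evaluated both ways.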

Paper Structure (26 sections, 6 equations, 8 figures, 7 tables, 3 algorithms)


Figures (8)

  • Figure 1: Performance variation across searched values of $n$ (left side) and $\gamma$ (right side) on the ExpsetB (solid line) and ExpsetB$^-$ (dashed line) datasets. The lines represent the mean values across the 5 folds, and $95\%$ confidence intervals are depicted. Markers highlight selected hyperparameter values.
  • Figure 2: News media graph built from all four CC-News snapshots (only English-speaking sources) and used for experimentation.
  • Figure 3: 5-fold cross-validation results obtained on the ExpsetB dataset with the two best strategies, P-Reliability and I-Reliability, using different graphs. The x-axis represents the CC-News snapshot used to build the graph, and the y-axis the macro-averaged F$_1$ score.
  • Figure 4: Scatter plot showing the correlation between the rankings obtained by PageRank values (y-axis) and News Guard scores (x-axis).
  • Figure 5: Scatter plot showing the correlation between the rankings obtained by F-Reliability values (y-axis) and News Guard scores (x-axis). Left side without rewards and right side with rewards.
  • ...and 3 more figures