"The Data Says Otherwise"-Towards Automated Fact-checking and Communication of Data Claims

Yu Fu, Shunan Guo, Jane Hoffswell, Victor S. Bursztyn, Ryan Rossi, John Stasko

TL;DR

The paper tackles misinformation arising from data-driven claims by introducing Aletheia, an automated fact-checking prototype that uses an LLM-based backend to map textual data claims to data facts, retrieve relevant evidence from datasets, and present results via data tables or visualizations. It adapts a six-component framework (data claim detection, text-to-data mapping, data evidence retrieval, veracity determination, data evidence presentation, and end-user interaction) to data claims, and implements a seven-step prompting pipeline to generate data fact specifications. Through a curated dataset of 400 claims across 10 data-fact types and a mixed-method user study with 20 participants, the authors show that visualization representations generally reduce review time and increase user confidence, while also revealing domain-specific design considerations. They offer four design recommendations for presenting data evidence and discuss limitations and future work, highlighting Aletheia’s potential to support data journalism and broader data-intensive communication while mitigating misinformation.
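The retrieval and veracity steps named above can be illustrated with a minimal sketch. It assumes a hypothetical specification layout (`measure`, `filters`, `claimed`), a toy in-memory dataset, and a relative tolerance for numeric comparison; none of these names or values come from the paper. The verdict labels follow the three outcomes the interface reports (accurate, inaccurate, unverifiable).

```python
# Hypothetical reference dataset; rows and column names are illustrative only.
DATASET = [
    {"country": "A", "year": 2020, "population": 100},
    {"country": "A", "year": 2021, "population": 110},
    {"country": "B", "year": 2021, "population": 90},
]


def retrieve_evidence(rows, filters):
    """Return the rows whose columns equal every value in `filters`."""
    return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


def verify(spec, rows, tolerance=0.01):
    """Check a claimed value against the dataset and return (verdict, evidence).

    `spec` uses an assumed layout, not the paper's actual specification format:
      measure - column holding the claimed quantity
      filters - column/value pairs that select the relevant row
      claimed - the number stated in the text
    """
    evidence = retrieve_evidence(rows, spec["filters"])
    values = [r[spec["measure"]] for r in evidence if spec["measure"] in r]
    # Without exactly one matching value, the claim cannot be checked here.
    if len(values) != 1:
        return "unverifiable", evidence
    actual = values[0]
    close = abs(actual - spec["claimed"]) <= tolerance * max(abs(actual), 1)
    return ("accurate" if close else "inaccurate"), evidence


claim_spec = {
    "measure": "population",
    "filters": {"country": "A", "year": 2021},
    "claimed": 110,
}
verdict, evidence = verify(claim_spec, DATASET)
print(verdict, evidence)
```

Returning the evidence rows alongside the verdict mirrors the idea that the retrieved data also serves as the justification shown to the reader.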

Abstract

Fact-checking data claims requires data evidence retrieval and analysis, which can become tedious and intractable when done manually. This work presents Aletheia, an automated fact-checking prototype designed to facilitate data claims verification and enhance data evidence communication. For verification, we utilize a pre-trained LLM to parse the semantics for evidence retrieval. To effectively communicate the data evidence, we design representations in two forms: data tables and visualizations, tailored to various data fact types. Additionally, we design interactions that showcase a real-world application of these techniques. We evaluate the performance of two core NLP tasks with a curated dataset comprising 400 data claims and compare the two representation forms regarding viewers' assessment time, confidence, and preference via a user study with 20 participants. The evaluation offers insights into the feasibility and bottlenecks of using LLMs for data fact-checking tasks, potential advantages and disadvantages of using visualizations over data tables, and design recommendations for presenting data evidence.

Paper Structure (28 sections, 9 figures, 2 tables)

Figures (9)

  • Figure 1: An overview of our modified framework for automated data claim fact-checking and communication, based on the framework of Guo et al. (Guo_2022_survey_automated_factchecking). The process begins by extracting data claims from data articles. These claims are mapped into data fact specifications designed to fetch pertinent evidence. This evidence not only aids in determining the veracity of the associated data claim but also serves as the justification for the verdict. The initial three components form a pipeline for the NLP tasks. This NLP pipeline (Figure 2) underpins Aletheia's backend. The last three components connect to Aletheia's interface (Figure 3).
  • Figure 2: Overview of our LLM-based pipeline, which takes in a data article and outputs JSON specifications used to retrieve data evidence for each data claim. Seven steps are chained. The first five steps form the "pre-processing phase", which transforms the input text first into individual data claims (S1) with compound claim identification (S2), then into distinct data facts (S3) with coreference resolution (S4) and ellipsis resolution (S5). These data facts are further processed in the last two steps, the "core steps" of our prompt pipeline, which classify the types of data facts (S6) before converting them into data fact specifications used for retrieving the pertinent data evidence (S7).
  • Figure 3: Aletheia's interface. Users enter textual content and select/upload a reference dataset in Input View (A). The backend then detects data claims, retrieves corresponding data evidence, and verifies them. The fact-checking results are presented in Result View (B), utilizing color codings to signify their verdicts: accurate, inaccurate, and unverifiable. Users click on the highlighted data claims to access the Evidence View (C). This view contains the designed data evidence presentation and interactions.
  • Figure 4: Success rate of data fact specification transformation. Each row corresponds to a distinct data fact type. Boxes within the rows represent individual examples. Green boxes indicate successful transformations, where all transformed attributes and values match the ground truth. Red boxes represent examples with incomplete or incorrect conversions. The small rectangles below the red boxes represent partial match performance with the same color code. The average rate of complete matches is 89.5%.
  • Figure 5: Quantitative results from our user study with 20 participants comparing visualizations and tables as different data evidence presentation forms. (A) The distribution (box) and individual (point) time taken to assess the accuracy of thirteen distinct data facts. The x-axis represents the data fact types, while the y-axis indicates the duration in seconds. The two diverging bar charts show the average shift in (B) the viewers' confidence and (C) their preferences across the thirteen data fact types. Right-pointing bars signify that participants have greater confidence in their assessment when using the visualization, or they prefer to use visualizations for fact-checking the respective data facts. Conversely, left-pointing bars indicate greater confidence or preference for tables. Figures (D) and (E) display the percentage distribution for each response option regarding confidence shift (D) and preference (E). The length of the bars represents the percentage of each selection on the five-point scale. Gray bars represent Neutral. Orange bars represent Table while purple bars represent Visualization. Darker red and purple signify greater intensity (i.e., much more confident/strongly favor).
  • ...and 4 more figures
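
The complete-match and partial-match scoring described in the Figure 4 caption can be sketched as follows. This is only an illustration: a specification is treated as a flat dictionary of attributes, and partial match is taken to be the fraction of attributes that agree with the ground truth. Both choices are assumptions, since the caption does not define the exact computation, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def score_specification(predicted: Dict, truth: Dict) -> Tuple[bool, float]:
    """Return (complete_match, partial_match_fraction) for one example.

    An attribute counts as matched only if it appears in both dictionaries
    with an equal value. Missing or extra attributes count as mismatches.
    """
    keys = set(predicted) | set(truth)
    if not keys:
        return True, 1.0
    matched = sum(
        1 for k in keys
        if k in predicted and k in truth and predicted[k] == truth[k]
    )
    return matched == len(keys), matched / len(keys)


def complete_match_rate(pairs: Iterable[Tuple[Dict, Dict]]) -> float:
    """Share of (predicted, truth) pairs in which every attribute matches."""
    results = [score_specification(p, t)[0] for p, t in pairs]
    return sum(results) / len(results) if results else 0.0
```

Under this reading, an example with one wrong attribute out of two would appear as a red box with a half-filled partial-match indicator, and the rate reported in the caption would be the average of the complete-match flags over all examples.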