Missci: Reconstructing Fallacies in Misrepresented Science

Max Glockner, Yufang Hou, Preslav Nakov, Iryna Gurevych

TL;DR

Missci introduces a formal argumentation framework and a health misinformation dataset for reconstructing and verbalizing the fallacious reasoning used to misrepresent scientific publications. The approach models the gap between an accurate premise and an incorrect claim as one or more fallacious premises, and requires an explicit explanation of the fallacies used. Evaluations of two large language models show that GPT-4 substantially outperforms an open-source alternative in fallacy identification and premise generation, though the task remains challenging and performance often depends on the fallacy class. The work provides a testbed for evaluating critical reasoning and has implications for automatic debunking and digital literacy, while acknowledging ethical considerations and limitations in real-world deployment.

Abstract

Health-related misinformation on social networks can lead to poor decision-making and real-world dangers. Such misinformation often misrepresents scientific publications and cites them as "proof" to gain perceived credibility. To effectively counter such claims automatically, a system must explain how the claim was falsely derived from the cited publication. Current methods for automated fact-checking or fallacy detection neglect to assess the (mis)used evidence in relation to misinformation claims, which is required to detect the mismatch between them. To address this gap, we introduce Missci, a novel argumentation-theoretical model for fallacious reasoning, together with a new dataset for detecting real-world misinformation that misrepresents biomedical publications. Unlike previous fallacy detection datasets, Missci (i) focuses on implicit fallacies between the relevant content of the cited publication and the inaccurate claim, and (ii) requires models to verbalize the fallacious reasoning in addition to classifying it. We present Missci as a dataset to test, in a zero-shot setting, the critical reasoning abilities that large language models (LLMs) need to reconstruct real-world fallacious arguments. We evaluate two representative LLMs and the impact of different levels of detail about the fallacy classes provided to the LLM via prompts. Our experiments and human evaluation show promising results for GPT-4, while also demonstrating the difficulty of this task.

Paper Structure (65 sections, 3 equations, 25 figures, 15 tables)

Figures (25)

  • Figure 1: Fallacious Argument Reconstruction: The claim is falsely derived from the cited study (green) by relying on the content of $p_0$. The model generates and classifies the fallacious reasoning (orange) that needs to be applied when concluding the claim based on all relevant study content (including $s_1$ and $s_2$).
  • Figure 2: Interchangeable Fallacies: On the left, no distinction between the different "spike proteins" (from the vaccine or the virus) is made; on the right, both are assumed to behave alike. Only one of these premises is needed to bridge the reasoning gap.
  • Figure 3: Interchangeable Fallacy Classes: Heatmap of co-occurring interchangeable fallacy classes of the consolidated arguments ordered by frequency.
  • Figure 4: Performance per Fallacy: F1-score per predicted fallacy class from a multi-label multi-class perspective considering all model predictions.
  • Figure 5: Relaxed Fallacy Detection: Performance when a prediction counts as correct if any fallacy class within the model's top k results matches any of the gold interchangeable fallacy classes.
  • ...and 20 more figures
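
The relaxed matching criterion described in the Figure 5 caption can be illustrated with a minimal sketch. It assumes that each example comes with a ranked list of predicted fallacy classes and a set of gold interchangeable classes. The function names (`relaxed_hit`, `relaxed_accuracy`), the input layout, and the class labels in the usage note are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Iterable, Sequence, Set, Tuple


def relaxed_hit(ranked_predictions: Sequence[str], gold_classes: Set[str], k: int) -> bool:
    """Return True if any of the top-k predicted classes is a gold class.

    ranked_predictions: predicted fallacy classes, best first (assumed layout).
    gold_classes: the interchangeable gold classes for this fallacy.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return any(pred in gold_classes for pred in ranked_predictions[:k])


def relaxed_accuracy(examples: Iterable[Tuple[Sequence[str], Set[str]]], k: int) -> float:
    """Fraction of examples with a relaxed hit within the top-k predictions."""
    examples = list(examples)
    if not examples:
        return 0.0
    hits = sum(relaxed_hit(preds, gold, k) for preds, gold in examples)
    return hits / len(examples)
```

For instance, with the hypothetical input `[(["Hasty Generalization", "Ambiguity"], {"Ambiguity", "False Equivalence"})]`, the prediction is a miss at k = 1 and a hit at k = 2, because the second-ranked class matches one of the interchangeable gold classes.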