Dialogues with AI Reduce Beliefs in Misinformation but Build No Lasting Discernment Skills

Anku Rani, Valdemar Danry, Paul Pu Liang, Andrew B. Lippman, Pattie Maes

TL;DR

This study investigates whether dialogues with an AI assistant can not only reduce beliefs in misinformation in the moment but also foster lasting misinformation-detection skills. Using a month-long, three-phase longitudinal design with 67 participants, the authors find a clear paradox: AI-assisted sessions raise immediate accuracy by about 21.3 percentage points, yet unaided performance on new items falls by about 15.3 percentage points by week 4, driven mainly by failures to detect fake content. Through quantitative analyses and conversation-level classifications, the work shows that certain AI strategies improve assisted accuracy (e.g., image-forensic cues, recalling prior knowledge) but often undermine learning unless balanced with prompts that promote independent reasoning. The findings bear on the design of AI copilots in misinformation contexts: interventions should both correct beliefs and cultivate durable discernment skills, for example through Socratic prompting and reasoning-focused dialogue.

Abstract

Given the growing prevalence of fake information, including increasingly realistic AI-generated news, there is an urgent need to train people to better evaluate and detect misinformation. While interactions with AI have been shown to durably reduce people's beliefs in false information, it is unclear whether these interactions also teach people the skills to discern false information themselves. We conducted a month-long study in which 67 participants classified news headline-image pairs as real or fake, discussed their assessments with an AI system, and then completed an unassisted evaluation of unseen news items, allowing us to measure accuracy before, during, and after AI assistance. While AI assistance produced immediate improvements during AI-assisted sessions (+21% on average), participants' unassisted performance on new items declined significantly by week 4 (-15.3%). These results indicate that while AI may help immediately, it ultimately degrades long-term misinformation detection abilities.
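
The accuracy changes reported above can be read as differences between per-phase accuracies, expressed in percentage points (as the TL;DR states them). The following is a minimal sketch of that arithmetic, not the authors' analysis code: the `accuracy` and `point_change` helpers, the record layout, and the toy responses are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def accuracy(responses):
    """Percentage of real/fake judgements that match the ground-truth label."""
    return 100.0 * mean(
        1.0 if r["answer"] == r["label"] else 0.0 for r in responses
    )


def point_change(baseline, comparison):
    """Accuracy difference in percentage points (comparison minus baseline)."""
    return accuracy(comparison) - accuracy(baseline)


# Toy, made-up responses for one participant (NOT data from the study).
before_ai = [
    {"label": "fake", "answer": "real"},
    {"label": "real", "answer": "real"},
    {"label": "fake", "answer": "fake"},
    {"label": "real", "answer": "fake"},
]
with_ai = [
    {"label": "fake", "answer": "fake"},
    {"label": "real", "answer": "real"},
    {"label": "fake", "answer": "fake"},
    {"label": "real", "answer": "fake"},
]

print(accuracy(before_ai))                 # accuracy before the AI dialogue
print(accuracy(with_ai))                   # accuracy after the AI dialogue
print(point_change(before_ai, with_ai))    # change in percentage points
```

A percentage-point change is an absolute difference between two accuracies, which is not the same as a relative percent change; the sketch computes only the former.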

Paper Structure

This paper contains 54 sections, 17 figures, 1 table.

Figures (17)

  • Figure 1: Overview of our AI chatbot's interaction flow. Participants first indicate whether they have seen the news item before, then provide initial authenticity ratings (Step 1). The system then engages them in up to 9 rounds of persuasive dialogue about the item's authenticity (Step 2 shows one example of an exchange). Finally, participants provide updated ratings, allowing us to measure belief change from the AI interaction.
  • Figure 2: The system integrates three core components: (1) a React-based front-end for presenting news headline-image pairs and collecting participant responses (authenticity ratings and confidence levels), (2) OpenAI GPT-4o for generating contextually appropriate persuasive dialogue based on participant assessments, and (3) the Google Custom Search API for searching the web in real time with the headline and retrieving relevant information. The architecture supports iterative conversations with comprehensive data logging through Qualtrics and Google Sheets for subsequent analysis. Bidirectional communication between GPT-4o and the search API ensures responses incorporate current factual information, while the feedback loop enables up to 9 rounds of persuasive dialogue per news item.
  • Figure 3: Sample dataset from MirageNews [huang2024miragenews] containing news image-headline pairs with ground truth labels. The dataset includes both real news stories (left: aircraft wreckage recovery, right: President Biden) and fabricated content (center: fake World Economic Forum gathering, fake violent clashes), demonstrating the mixed real/fake nature of misinformation datasets used to evaluate AI detection capabilities.
  • Figure 4: Longitudinal study design workflow showing the three-phase experimental protocol. Participants complete the Before AI (rate a news item as Real or Fake and provide a confidence rating before interacting with the AI), AI Interaction (converse with the AI for up to 9 rounds about the news item), With AI (re-rate the news item after the conversation), and After AI (rate new, unseen news items) phases across multiple sessions at weeks 0, 2, and 4. This design separates immediate AI assistance effects from independent skill development, enabling measurement of whether participants develop lasting misinformation detection abilities.
  • Figure 5: Quantitative results showing participant discernment accuracy of true and false information before, with, and after AI support across the four weeks. Left: Side-by-side comparison of accuracy differences for before AI support, with AI support, and after AI support (unassisted). Right: Line plot showing changes in accuracy over Week 0, Week 2, and Week 4. 95% confidence intervals.
  • ...and 12 more figures