WildClaims: Information Access Conversations in the Wild(Chat)

Hideaki Joko, Shakiba Amirshahi, Charles L. A. Clarke, Faegheh Hasibi

TL;DR

The paper investigates how real-world interactions with LLMs yield information access beyond explicit user requests by analyzing the WildChat corpus and introducing the WildClaims dataset. It extracts and annotates 121,905 factual claims from 7,587 system utterances across 3,000 conversations, using two claim extraction methods, $F_{Huo}$ and $F_{Song}$, and two check-worthiness classifiers, $CW_{Majer}$ and $CW_{Hassan}$, with the performance of their combination, $CW_{Union}$, guiding prevalence estimates. The results reveal a substantial presence of check-worthy assertions even in non-information-seeking contexts: conservative lower-bound estimates are 18% with $F_{Huo}$ and 51% with $F_{Song}$, and non-conservative estimates reach up to 76% of conversations containing check-worthy content. The findings argue for a broader definition of conversational information access that encompasses implicit knowledge transfer, and highlight the need for verification-aware evaluation and user simulators to better model real-world interactions with LLMs.
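To make the prevalence figures concrete, the following is a minimal sketch of how a conversation-level estimate could be computed from claim-level labels. It is not the authors' implementation. It assumes that the Union label marks a claim as check-worthy when either classifier flags it, and that a conversation counts as containing check-worthy content when at least one of its claims is flagged. The `Claim` record, its field names, and the toy data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One extracted factual claim (hypothetical schema, not the dataset's)."""
    conversation_id: str
    utterance_id: str
    text: str
    cw_majer: bool   # assumed binary label from the CW_Majer classifier
    cw_hassan: bool  # assumed binary label from the CW_Hassan classifier


def cw_union(claim: Claim) -> bool:
    """Assumed Union rule: check-worthy if either classifier says so."""
    return claim.cw_majer or claim.cw_hassan


def conversation_prevalence(claims, conversation_ids) -> float:
    """Fraction of conversations with at least one check-worthy claim."""
    all_ids = set(conversation_ids)
    if not all_ids:
        return 0.0
    flagged = {c.conversation_id for c in claims if cw_union(c)}
    # Only count conversations that are part of the sampled set.
    return len(flagged & all_ids) / len(all_ids)


# Toy data: four conversations, one of which yielded no claims at all.
claims = [
    Claim("c1", "u1", "Paris is the capital of France.", True, False),
    Claim("c1", "u1", "The story is set in spring.", False, False),
    Claim("c2", "u2", "The hero feels nervous.", False, False),
    Claim("c3", "u3", "Water boils at 100 degrees Celsius at sea level.", False, True),
]
conversation_ids = ["c1", "c2", "c3", "c4"]

prevalence = conversation_prevalence(claims, conversation_ids)
print(f"{prevalence:.0%} of conversations contain a check-worthy claim")
```

Dividing by the full set of sampled conversations, rather than only those with extracted claims, keeps conversations without any claims in the denominator. Whether the paper uses that denominator is also an assumption of this sketch.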

Abstract

The rapid advancement of Large Language Models (LLMs) has transformed conversational systems into practical tools used by millions. However, the nature and necessity of information retrieval in real-world conversations remain largely unexplored, as research has focused predominantly on traditional, explicit information access conversations. The central question is: What do real-world information access conversations look like? To this end, we first conduct an observational study on the WildChat dataset, large-scale user-ChatGPT conversations, finding that users' access to information occurs implicitly as check-worthy factual assertions made by the system, even when the conversation's primary intent is non-informational, such as creative writing. To enable the systematic study of this phenomenon, we release the WildClaims dataset, a novel resource consisting of 121,905 extracted factual claims from 7,587 utterances in 3,000 WildChat conversations, each annotated for check-worthiness. Our preliminary analysis of this resource reveals that conservatively 18% to 51% of conversations contain check-worthy assertions, depending on the methods employed, and less conservatively, as many as 76% may contain such assertions. This high prevalence underscores the importance of moving beyond the traditional understanding of explicit information access, to address the implicit information access that arises in real-world user-system conversations.

Paper Structure

This paper contains 14 sections, 2 figures, 6 tables.

Figures (2)

  • Figure 1: Examples of conversations from the WildChat dataset, demonstrating how various tasks can generate responses with factual implications that require information retrieval and verification. None of these tasks are explicitly information seeking.
  • Figure 2: Prevalence of check-worthy utterances in the WildClaims dataset by task category, using $CW_{Union}$ as the classification method due to its highest correlation with human annotations (see Table \ref{tab:cw-effectiveness}).