Quriosity: Analyzing Human Questioning Behavior and Causal Inquiry through Curiosity-Driven Queries

Roberto Ceraolo; Dmitrii Kharlapenko; Ahmad Khan; Amélie Reymond; Punya Syon Pandey; Rada Mihalcea; Bernhard Schölkopf; Mrinmaya Sachan; Zhijing Jin

TL;DR

Quriosity studies how humans pose curiosity-driven questions in the age of large language models. Drawing on a corpus of 13.5K questions from three channels, human-to-search-engine (H-to-SE), human-to-human (H-to-H), and human-to-LLM (H-to-LLM), the work finds that a substantial fraction of curiosity-driven queries are causal (up to 42%) and analyzes their linguistic and cognitive properties. The paper introduces an iterative prompt-improvement framework to identify causal questions, trains efficient causal-question routers, and benchmarks LLMs on this data, showing that current systems struggle with anticipatory causal reasoning and tend to be verbose. The dataset, analysis, and baseline models provide a foundation for improved open-ended chatbot interactions and robust routing of causal inquiries in AI-assisted information seeking.

Abstract

Recent progress in Large Language Model (LLM) technology has changed our role in interacting with these models. Instead of primarily testing these models with questions we already know the answers to, we now use them for queries whose answers are unknown to us, driven by human curiosity. This shift highlights the growing need to understand curiosity-driven human questions: those that are more complex, open-ended, and reflective of real-world needs. To this end, we present Quriosity, a collection of 13.5K naturally occurring questions from three diverse sources: human-to-search-engine queries, human-to-human interactions, and human-to-LLM conversations. Our comprehensive collection enables a rich understanding of human curiosity across various domains and contexts. Our analysis reveals a significant presence of causal questions (up to 42%) in the dataset, for which we develop an iterative prompt improvement framework to identify all causal queries and examine their unique linguistic properties, cognitive complexity, and source distribution. Our paper paves the way for future work on causal question identification and open-ended chatbot interactions. Our code and data are at https://github.com/roberto-ceraolo/quriosity.

Paper Structure (66 sections, 9 figures, 19 tables)


Introduction
Dataset Construction
Question Sources
Data Statistics
Overall.
Topic Coverage.
Comparison to Existing Datasets
Exploring Curiosity-Driven Queries
How Do Natural Inquiries Differ from Curated Test Questions?
Methods.
Results.
Does Human Question Behavior Vary across Channels?
Inferred User Needs
Cognitive Complexity
Knowledge Domain
...and 51 more sections

Figures (9)

Figure 1: t-SNE visualisation of the main topic clusters in our Quriosity dataset. Cluster 1: Daily life. Cluster 2: Computer-related. Cluster 3: Sports, medicine, and science. Cluster 4: Prompt Questions. Cluster 5: Stories and fictional characters. See examples for each cluster in \ref{tab:cluster-examples}.
Figure 2: User needs across sources of H-to-SE, H-to-H, and H-to-LLM interactions.
Figure 3: Cognitive skill distribution in causal vs. non-causal questions, showing a more balanced distribution across causal inquiries.
Figure 4: Combined Precision-Recall curve for the FLAN models.
Figure 5: Combined ROC curve for the FLAN models.
...and 4 more figures
