Emerging Reliance Behaviors in Human-AI Content Grounded Data Generation: The Role of Cognitive Forcing Functions and Hallucinations

Zahra Ashktorab; Qian Pan; Werner Geyer; Michael Desmond; Marina Danilevsky; James M. Johnson; Casey Dugan; Michelle Bachman

TL;DR

This study investigates how hallucinations and Cognitive Forcing Functions (CFFs) affect data quality and reliance patterns in human-AI co-creation of content-grounded data for fine-tuning LLMs in HR/customer-support contexts. Using a mixed between-within design with 34 participants across 8 tasks, the authors manipulate CFF type and presence alongside hallucination presence, and evaluate responses with a rubric covering faithfulness, accuracy, completeness, and AI usage. They find that hallucinations substantially degrade data quality and that CFFs do not reliably mitigate this effect, though CFFs influence how users engage with AI suggestions and give rise to novel reliance behaviors (e.g., appending AI content to correct answers). The authors offer a nuanced view of AI reliance in co-creative tasks, highlight the need for conditional CFF deployment, and propose a taxonomy of reliers along with a practical data-quality rubric for improving AI-assisted data generation. These findings have practical implications for designing data-collection pipelines and evaluation schemes that produce higher-quality fine-tuning data for content-grounded LLMs in organizational settings.

Abstract

We investigate the impact of hallucinations and Cognitive Forcing Functions in human-AI collaborative content-grounded data generation, focusing on the use of Large Language Models (LLMs) to assist in generating high-quality conversational data. Through a study with 34 users who each completed 8 tasks (n=272), we found that hallucinations significantly reduce data quality. While Cognitive Forcing Functions do not always alleviate these effects, their presence influences how users integrate AI responses. Specifically, we observed emerging reliance behaviors, with users often appending AI-generated responses to their correct answers, even when the AI's suggestions conflicted. This points to a potential drawback of Cognitive Forcing Functions, particularly when AI suggestions are inaccurate. Users who overrelied on AI-generated text produced lower-quality data, emphasizing the nuanced dynamics of overreliance in human-LLM collaboration compared to traditional human-AI decision-making.

Paper Structure (34 sections, 6 figures, 7 tables)

This paper contains 34 sections, 6 figures, 7 tables.

Introduction
Related Work
Human AI Collaboration and Decision Making
LLMs
Faithfulness Evaluation
Reliance on AI
Cognitive Forcing Functions
Background
Co-creating Content Grounded Data for AI-Assisted Customer Support
Defining Hallucinations
Hypotheses
Methodology
Participants
Procedure
Experimental Setup and Artifacts
...and 19 more sections

Figures (6)

Figure 1: Flow of tasks presented to each user. Each participant was assigned to one of the CFF conditions (Formulate, Highlight, Read First) and then completed 8 tasks with ordering randomized.
Figure 2: Cognitive Forcing Functions. 1) Reference document; 2) textbox for the user to submit the final response. A) Interaction in the Formulate condition, in which the user is first asked to respond to the question without an AI suggestion. B) Response box corresponding to the Highlight condition. C) Read First condition: the user first sees the reference document before seeing the respective chat and customer query.
Figure 3: Task overview presented to participants before completing the eight tasks. 1) Final response submitted by the user; 2) reference document; 3) AI-generated suggestion; 4) copy button presented to users.
Figure 4: Use of AI by Cognitive Forcing Function Type. When the Cognitive Forcing Function was absent, participants in the Formulate condition used AI significantly more than those in the Highlight condition.
Figure 5: Impact of Hallucinations on Data Quality and AI Use. Data quality decreases (left) and AI reliance drops (right) when hallucinations are present.
...and 1 more figure
