Effects of Generative AI Errors on User Reliance Across Task Difficulty

Jacy Reese Anthis, Hannah Cha, Solon Barocas, Alexandra Chouldechova, Jake Hofman

Abstract

The capabilities of artificial intelligence (AI) lie along a jagged frontier, where AI systems surprisingly fail on tasks that humans find easy and succeed on tasks that humans find hard. To investigate user reactions to this phenomenon, we developed an incentive-compatible experimental methodology based on diagram generation tasks, in which we induce errors in generative AI output and test effects on user reliance. We demonstrate the interface in a preregistered 3×2 experiment (N = 577) with error rates of 10%, 30%, or 50% on easier or harder diagram generation tasks. We confirmed that observing more errors reduces use, but we unexpectedly found that easy-task errors did not significantly reduce use more than hard-task errors, suggesting that people are not averse to jaggedness in this experimental setting. We encourage future work that varies task difficulty at the same time as other features of AI errors, such as whether the jagged error patterns are easily learned.

Paper Structure

This paper contains 13 sections and 8 figures.

Figures (8)

  • Figure 1: Screenshots of the study interface. In Phase 1 (left), participants predict whether the AI tool will successfully generate the diagram. In Phase 2 (right), participants report their WTP for an opportunity to use the AI tool for a monetary reward.
  • Figure 2: Bids across the six experimental conditions. Means are model-adjusted, and error bars show standard errors.
  • Figure 3: The easiest (i.e., simplest) (A) and hardest (i.e., most complex) (B) diagrams that participants were asked to recreate in Phase 2.
  • Figure 4: First stages of the study interface. This participant was randomly assigned to view the Phase 1 tasks in ascending order of difficulty (easier to harder) and to see errors later in the phase (for the more difficult tasks, not shown here). The Phase 2 task order is fully randomized.
  • Figure 5: Additional stages of the study interface. This participant was randomly assigned to view the Phase 1 tasks in ascending order of difficulty (easier to harder) and to see errors later in the phase (for the more difficult tasks, not shown here). The Phase 2 task order is fully randomized.
  • ...and 3 more figures