Invisible failures in human-AI interactions

Christopher Potts; Moritz Sudhof

Abstract

AI systems fail silently far more often than they fail visibly. In a large-scale quantitative analysis of human-AI interactions from the WildChat dataset, we find that 78% of AI failures are invisible: something went wrong but the user gave no overt indication that there was a problem. These invisible failures cluster into eight archetypes that help us characterize where and how AI systems are failing to meet users' needs. In addition, the archetypes show systematic co-occurrence patterns indicating higher-level failure types. To address the question of whether these archetypes will remain relevant as AI systems become more capable, we also assess failures for whether they are primarily interactional or capability-driven, finding that 91% involve interactional dynamics, and we estimate that 94% of such failures would persist even with a more capable model. Finally, we illustrate how the archetypes help us to identify systematic and variable AI limitations across different usage domains. Overall, we argue that our invisible failure taxonomy can be a key component in reliable failure monitoring for product developers, scientists, and policy makers. Our code and data are available at https://github.com/bigspinai/bigspin-invisible-failure-archetypes

Paper Structure (34 sections, 2 equations, 11 figures, 1 table)

Introduction
Data
Cohort selection
Annotation process
The quality landscape
Methods
Overall quality category distribution
Signal tag distribution
The relationship between quality category and signal density
The relationship between quality category and signal tags
The acceptable tier as a hidden risk
Invisible failures
Methods
Invisible failure archetype distribution
The relationship between failure archetypes and quality labels
...and 19 more sections

Figures (11)

Figure 1: The quality landscape.
Figure 2: Invisible failure archetype distribution. The bar sizes and counts indicate the frequency of the archetype. Since individual transcripts can manifest multiple archetypes, the percentages are the percent of failure transcripts labeled with that archetype.
Figure 3: Archetypes and quality. We show the distribution of quality categories per archetype. The archetypes are strongly associated with poor and critical quality ratings. The only apparent exception is The mystery failure, but this category may contain serious failures that our annotations do not properly capture.
Figure 4: Archetype co-occurrence. The cells give PPMI values, with darker purple indicating higher PPMI values (stronger associations). The co-occurrence patterns point to higher-level failure patterns.
Figure 5: Archetype--domain co-occurrence. The cell values give PPMI values, with darker blues indicating stronger associations. Different domains have different archetype associations, pointing to different underlying challenges for AI systems.
...and 6 more figures
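
The captions for Figures 4 and 5 report PPMI (positive pointwise mutual information) values. As a rough illustration of what such a cell value means, the following is a minimal sketch that assumes each transcript carries a set of archetype labels and that probabilities are estimated as transcript-level relative frequencies. The function name `ppmi_cooccurrence`, the placeholder labels, and the choice of log base 2 are assumptions made for illustration; the paper's own estimation details may differ.

```python
import math
from collections import Counter
from itertools import combinations


def ppmi_cooccurrence(label_sets):
    """Return {(a, b): PPMI} for unordered label pairs (a < b) that co-occur.

    Hypothetical helper: probabilities are relative frequencies over
    transcripts, and PMI uses log base 2 (both are assumptions).
    """
    n = len(label_sets)
    single = Counter()  # number of transcripts containing each label
    pair = Counter()    # number of transcripts containing both labels
    for labels in label_sets:
        unique = sorted(set(labels))
        single.update(unique)
        pair.update(combinations(unique, 2))

    scores = {}
    for (a, b), count in pair.items():
        p_ab = count / n
        p_a = single[a] / n
        p_b = single[b] / n
        pmi = math.log2(p_ab / (p_a * p_b))
        # PPMI clips negative associations to zero.
        scores[(a, b)] = max(0.0, pmi)
    return scores


# Placeholder labels, not the paper's archetype names.
transcripts = [{"A", "B"}, {"A", "B"}, {"C"}, {"A", "C"}]
scores = ppmi_cooccurrence(transcripts)
```

Pairs that never co-occur are simply absent from the result, which corresponds to a PPMI of zero by convention. A pair that co-occurs less often than chance would predict is also clipped to zero, so only above-chance associations show up as darker cells.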
