Accuracy-Time Tradeoffs in AI-Assisted Decision Making under Time Pressure

Siddharth Swaroop; Zana Buçinca; Krzysztof Z. Gajos; Finale Doshi-Velez

TL;DR

This work asks how to achieve high decision accuracy without sacrificing speed in AI-assisted decision making under time pressure. Two controlled experiments compare four AI assistance types (No-AI, AI-before, AI-after, Mixed) with and without time pressure, mapping accuracy-time tradeoffs across tasks of varying difficulty. Time pressure changes the relative benefits of the AI assistances: AI-before becomes the fastest but comes with higher overreliance, and scarcity effects dissipate under pressure. Overreliance is a detectable, somewhat stable individual tendency that can predict subsequent behavior. The study also provides exploratory evidence that adapting AI assistance to both user traits (overreliance propensity) and task properties (difficulty) can enhance human-AI complementarity, particularly under time pressure. Overall, the results show that evaluations of AI assistance need to account for time pressure, and they support adaptive, context-aware decision-support tools that tailor how AI is presented to the user and the task in order to optimize the accuracy-time balance in real-world, time-constrained settings.

Abstract

In settings where users both need high accuracy and are time-pressured, such as doctors working in emergency rooms, we want to provide AI assistance that both increases decision accuracy and reduces decision-making time. Current literature focusses on how users interact with AI assistance when there is no time pressure, finding that different AI assistances have different benefits: some can reduce time taken while increasing overreliance on AI, while others do the opposite. The precise benefit can depend on both the user and task. In time-pressured scenarios, adapting when we show AI assistance is especially important: relying on the AI assistance can save time, and can therefore be beneficial when the AI is likely to be right. We would ideally adapt what AI assistance we show depending on various properties (of the task and of the user) in order to best trade off accuracy and time. We introduce a study where users have to answer a series of logic puzzles. We find that time pressure affects how users use different AI assistances, making some assistances more beneficial than others when compared to no-time-pressure settings. We also find that a user's overreliance rate is a key predictor of their behaviour: overreliers and not-overreliers use different AI assistance types differently. We find marginal correlations between a user's overreliance rate (which is related to the user's trust in AI recommendations) and their personality traits (Big Five Personality traits). Overall, our work suggests that AI assistances have different accuracy-time tradeoffs when people are under time pressure compared to no time pressure, and we explore how we might adapt AI assistances in this setting.

Paper Structure (31 sections, 5 figures, 4 tables)

This paper contains 31 sections, 5 figures, 4 tables.

Introduction
Related Work
Decision-Making Under Time Pressure
Accuracy, Reliance, and Time in AI-Assisted Decision-Making
Experiment 1: Time pressure impacts behaviour
Task description
Conditions
Procedure
Design and analysis
Results
Time pressure impacts how participants use AI assistance types differently (hypothesis H1)
We can predict a participant's overreliance rate (hypothesis H2)
Experiment 2: Assigning different participants to time pressure or no time pressure
Task description and conditions
Procedure
...and 16 more sections

Figures (5)

Figure 1: The alien prescription task, where participants must prescribe a single medicine. The information about the alien includes the alien's unique treatment plan (a set of rules) and the alien's observed symptoms. Participants have to use these observed symptoms and rules to prescribe a single medicine, such that only the observed symptoms and any potential intermediate (green) symptoms are used, and no other unobserved symptoms. When an AI assistance is shown, it is shown in a red box, like in this example. Here, the AI recommendation is the best possible (tranquilizers uses the most observed symptoms). Vitamins is also a correct medicine, but is suboptimal as it uses fewer observed symptoms. All other medicines are incorrect.
Figure 2: Pareto plots: top left is desired (higher accuracy, lower response time per question). We plot the performance (mean and standard error) of our four conditions (No-AI, AI-before, AI-after and mixed) under no time pressure (bold lines) and under time pressure (dotted lines). We see that, under time pressure, all conditions become quicker, but AI-before becomes much quicker than the others while keeping accuracy similar to the other AI assistance conditions. Accuracy in the No-AI condition drops under time pressure. See \ref{table:expt2_results} for values.
Figure 3: Pareto plots: top left is desired (higher accuracy, lower response time per question). We compare the performance (mean and standard error) of the AI assistances in the mixed condition (dotted lines) to the pure AI assistance-only conditions (bold lines). We see that, under no time pressure (left), mixed AI-before is marginally faster than pure AI-before (with significantly higher overreliance, see main text), and mixed AI-after is marginally slower than pure AI-after. Under time pressure (right), mixed AI-before now has the same response time as pure AI-before, and mixed No-AI has higher accuracy than pure No-AI.
Figure 4: Pareto plots: top left is desired (higher accuracy, lower response time per question). We compare the performance (mean and standard error) of the AI assistances in the mixed condition on easy questions (bold lines) to hard questions (dotted lines) under time pressure. We see that for overreliers (left) on hard questions, there is a tradeoff between AI-before and AI-after, and for not-overreliers (right) there is a marginal tradeoff on easy questions. Otherwise, AI-before is better (equal accuracy and quicker time compared to AI-after and No-AI).
Figure 5: The alien prescription task, where participants must prescribe a single medicine. In this example, the timers are shown on the screen (the global timer counts down from 20 minutes, as it is part of Experiment 2 (\ref{sec:experiment2})). This question is a hard-difficulty question, whereas the question in \ref{fig:alien_example} was easy-difficulty. It is hard because the optimal medicine (which, in this case, the AI recommends correctly to be 'tranquilizers') uses fewer observed symptoms than the other medicines. Therefore, if a participant wants to confirm that 'tranquilizers' is the optimal medicine, they have to check many other medicines too ('optimal' is defined as the medicine that uses the most observed symptoms while not using/treating any unobserved symptoms). For this alien, vitamins is also a correct medicine, but it is suboptimal. All other medicines are incorrect.
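
The definition of 'optimal' in the captions above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each medicine has already been flattened to the set of symptoms it ultimately uses, so the intermediate (green) symptoms of the real task are ignored. The function name `classify_medicines`, the symptom names, and the example alien are all hypothetical.

```python
from typing import Dict, Set


def classify_medicines(
    requirements: Dict[str, Set[str]],
    observed: Set[str],
) -> Dict[str, str]:
    """Label each medicine 'optimal', 'suboptimal', or 'incorrect'.

    Simplifying assumption: `requirements` maps a medicine to the set of
    symptoms it uses, with any intermediate symptoms already resolved.
    """
    # A medicine is correct only if it uses no unobserved symptom.
    correct = {
        name: len(used)
        for name, used in requirements.items()
        if used and used <= observed
    }
    # Among correct medicines, the optimal ones use the most observed symptoms.
    best = max(correct.values(), default=0)

    labels: Dict[str, str] = {}
    for name in requirements:
        if name not in correct:
            labels[name] = "incorrect"
        elif correct[name] == best:
            labels[name] = "optimal"
        else:
            labels[name] = "suboptimal"
    return labels


# Hypothetical alien: symptom names and rules are invented for illustration.
observed = {"fever", "cough", "rash"}
requirements = {
    "tranquilizers": {"fever", "cough", "rash"},
    "vitamins": {"fever"},
    "antibiotics": {"fever", "nausea"},  # 'nausea' is unobserved
}
labels = classify_medicines(requirements, observed)
```

In this invented example, `tranquilizers` is labelled optimal, `vitamins` is correct but suboptimal, and `antibiotics` is incorrect because it relies on an unobserved symptom.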
