AI Reliance and Decision Quality: Fundamentals, Interdependence, and the Effects of Interventions

Jakob Schoeffer, Johannes Jakubik, Michael Voessing, Niklas Kuehl, Gerhard Satzger

TL;DR

This work disentangles reliance behavior from final decision quality in AI-assisted decision-making by formalizing the interdependence between human adherence and AI accuracy. It introduces a visual rectangle framework that relates adherence level $\mathcal{A}$ to final accuracy $Acc_{final}$, clarifying when human-AI complementarity is achievable and how interventions can affect the quantity of reliance differently from its quality. The authors provide analytical results for attainable accuracy ranges, a quality-of-reliance metric $Q$, and practical tools, including an urn-based method and $F_{\beta}$-score perspectives, to interpret and compare empirical studies of explanations and other decision-support interventions. They demonstrate the framework on empirical studies, emphasizing the need to measure both reliance behavior and decision quality to avoid misinterpreting intervention effects. The work offers a blueprint for evaluating and designing interventions that genuinely enhance decision quality in AI-assisted systems.

Abstract

In AI-assisted decision-making, a central promise of having a human-in-the-loop is that they should be able to complement the AI system by overriding its wrong recommendations. In practice, however, we often see that humans cannot assess the correctness of AI recommendations and, as a result, adhere to wrong or override correct advice. Different ways of relying on AI recommendations have immediate, yet distinct, implications for decision quality. Unfortunately, reliance and decision quality are often inappropriately conflated in the current literature on AI-assisted decision-making. In this work, we disentangle and formalize the relationship between reliance and decision quality, and we characterize the conditions under which human-AI complementarity is achievable. To illustrate how reliance and decision quality relate to one another, we propose a visual framework and demonstrate its usefulness for interpreting empirical findings, including the effects of interventions like explanations. Overall, our research highlights the importance of distinguishing between reliance behavior and decision quality in AI-assisted decision-making.

Paper Structure (27 sections, 10 theorems, 14 equations, 10 figures, 3 tables)

Key Result

Proposition 1

For $n\rightarrow\infty$, a given AI accuracy $Acc_{AI}$, and a degree of adherence to AI recommendations, $\mathcal{A}$, the range of attainable decision-making accuracy $Acc_{final}$ is
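
The closed-form range is cut off in this excerpt, so the sketch below does not quote it. Instead, it derives bounds directly from the four cases described in Figure 1 (adhere or override, applied to a correct or wrong recommendation), treating $Acc_{AI}$ and $\mathcal{A}$ as fractions of a large number of cases. The function name `attainable_range` and the helper variable `x` (the share of cases in which the human adheres to a correct recommendation) are illustrative assumptions, not notation from the paper.

```python
def attainable_range(acc_ai: float, adherence: float) -> tuple[float, float]:
    """Lower and upper bound on final accuracy for fixed AI accuracy and adherence.

    Assumption: large-n limit, so all quantities are fractions of cases, and
    overriding a wrong recommendation always yields a correct final decision.
    """
    if not (0.0 <= acc_ai <= 1.0 and 0.0 <= adherence <= 1.0):
        raise ValueError("acc_ai and adherence must lie in [0, 1]")

    # x = share of all cases where the human adheres to a correct recommendation.
    # It cannot exceed either the correct share or the adhered share, and it
    # cannot be so small that adherence no longer fits into the wrong cases.
    x_min = max(0.0, acc_ai + adherence - 1.0)
    x_max = min(acc_ai, adherence)

    def final_accuracy(x: float) -> float:
        # Overridden cases: (1 - adherence). Of these, (acc_ai - x) were
        # correct recommendations, so the rest were wrong ones.
        override_wrong = (1.0 - adherence) - (acc_ai - x)
        return x + override_wrong

    return final_accuracy(x_min), final_accuracy(x_max)


low, high = attainable_range(0.7, 0.7)
print(f"{low:.2f} to {high:.2f}")
```

Under these assumptions, an AI accuracy of 70% combined with an adherence level of 70% gives a range from 40% to 100%: the best case adheres to exactly the correct recommendations, and the worst case adheres to all wrong ones first.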

Figures (10)

  • Figure 1: We consider concurrent AI-assisted decision-making where a human-in-the-loop receives an AI recommendation that can either be correct (✓) or wrong (✗). The human can either adhere to (bordered circle) or override (no border) the AI recommendation. When the human adheres to a correct or overrides a wrong AI recommendation, the final decision will be correct (cases (a) and (c)); in the remaining cases, it will be wrong (cases (b) and (d)). The correctness of the final decision is indicated by either blue (correct) or orange (wrong) shading.
  • Figure 2: Two paradigms in AI-assisted decision-making: (a) sequential and (b) concurrent, taken from ? (? ). The focus of this work is on concurrent AI-assisted decision-making.
  • Figure 3: Possible scenarios of reliance behavior and associated decision-making accuracy, given an AI accuracy of $Acc_{AI}=70\%$ and an adherence level of $\mathcal{A}=70\%$. Correct AI recommendations (✓) and wrong AI recommendations (✗) are separated by a dashed line.
  • Figure 4: The area of attainable decision-making accuracy for a given AI accuracy of (a) 70% and (b) 90%, and different levels of human adherence. The orange area indicates $Acc_{final}<Acc_{AI}$; the blue area indicates $Acc_{final}>Acc_{AI}$; the green dashed line indicates the level of adherence at which $Acc_{final}=100\%$ is attainable; the black line indicates the expected value of $Acc_{final}$ when the human-in-the-loop cannot discern correct from wrong recommendations.
  • Figure 5: Visualizing the effects of two different interventions (blue $\bullet$ and purple $\bullet$) on reliance behavior and decision-making accuracy.
  • ...and 5 more figures
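
The black line in Figure 4 can be illustrated with a small simulation. The sketch below assumes that "cannot discern" means the decision to adhere is statistically independent of whether the recommendation is correct; that reading, and the function names `expected_accuracy_blind` and `simulate_blind`, are assumptions made here for illustration rather than the paper's own formulation.

```python
import random


def expected_accuracy_blind(acc_ai: float, adherence: float) -> float:
    """Expected final accuracy when adherence is independent of correctness.

    The final decision is correct when the human adheres to a correct
    recommendation or overrides a wrong one.
    """
    return adherence * acc_ai + (1.0 - adherence) * (1.0 - acc_ai)


def simulate_blind(acc_ai: float, adherence: float,
                   n: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity (seeded for repeatability)."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        ai_correct = rng.random() < acc_ai
        adheres = rng.random() < adherence
        # Correct outcome: adhere to correct, or override wrong.
        if ai_correct == adheres:
            correct += 1
    return correct / n


analytic = expected_accuracy_blind(0.7, 0.7)
estimate = simulate_blind(0.7, 0.7)
print(f"analytic={analytic:.3f}, simulated={estimate:.3f}")
```

Under this independence assumption, full adherence reproduces the AI's accuracy, and any lower adherence pulls the expected accuracy toward that of always overriding, which is why blind reliance cannot yield complementarity in expectation when $Acc_{AI}>50\%$.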

Theorems & Definitions (13)

  • Definition 1: Human-in-the-loop
  • Definition 2: AI reliance
  • Definition 3: Complementarity
  • Proposition 1
  • Corollary 1
  • Proposition 2
  • Proposition 3
  • Corollary 2
  • Corollary 3
  • Corollary 4
  • ...and 3 more