Human-Aligned Calibration for AI-Assisted Decision Making

Nina L. Corvelo Benz; Manuel Gomez Rodriguez

TL;DR

The authors argue that, for a broad class of utility functions, there exist data distributions for which a rational decision maker is unlikely to discover the optimal decision policy using calibrated confidence values: an optimal decision maker would sometimes need to place more (less) trust in predictions with lower (higher) confidence values.

Abstract

Whenever a binary classifier is used to provide decision support, it typically provides both a label prediction and a confidence value. The decision maker is then supposed to use the confidence value to calibrate how much to trust the prediction. In this context, it has often been argued that the confidence value should correspond to a well-calibrated estimate of the probability that the predicted label matches the ground-truth label. However, multiple lines of empirical evidence suggest that decision makers have difficulty developing a good sense of when to trust a prediction using these confidence values. In this paper, our goal is first to understand why and then to investigate how to construct more useful confidence values. We first argue that, for a broad class of utility functions, there exist data distributions for which a rational decision maker is, in general, unlikely to discover the optimal decision policy using the above confidence values: an optimal decision maker would sometimes need to place more (less) trust in predictions with lower (higher) confidence values. However, we then show that, if the confidence values satisfy a natural alignment property with respect to the decision maker's confidence in her own predictions, there always exists an optimal decision policy under which the level of trust the decision maker would need to place in predictions is monotone in the confidence values, which makes the policy easier to discover. Further, we show that multicalibration with respect to the decision maker's confidence in her own predictions is a sufficient condition for alignment. Experiments on four different AI-assisted decision making tasks, in which a classifier provides decision support to real human experts, validate our theoretical results and suggest that alignment may lead to better decisions.
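
As a rough illustration of what "well calibrated" means in the abstract, the following Python sketch computes, per confidence bin, the gap between the mean confidence and the empirical frequency of the positive label. It is a minimal sketch, not the paper's procedure: the function name `calibration_gaps`, the equal-width binning, and the reading of the confidence as an estimate of P(Y = 1) are assumptions made only for this example.

```python
def calibration_gaps(confidences, labels, n_bins=10):
    """Per-bin |mean confidence - empirical P(Y=1)| on equal-width bins of [0, 1].

    Assumption (not from the paper): each confidence value is read as an
    estimate of P(Y = 1), and labels are 0 or 1.
    """
    sums = [0.0] * n_bins   # sum of confidence values per bin
    hits = [0] * n_bins     # number of positive labels per bin
    counts = [0] * n_bins   # number of samples per bin
    for c, y in zip(confidences, labels):
        k = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes to the last bin
        sums[k] += c
        hits[k] += y
        counts[k] += 1
    # Only bins that actually contain samples are reported.
    return {
        k: abs(sums[k] / counts[k] - hits[k] / counts[k])
        for k in range(n_bins)
        if counts[k] > 0
    }
```

A gap near zero in every populated bin is what calibration asks for. The abstract's point is that even when this holds, the trust a decision maker should place in a prediction need not increase with the confidence value.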

Paper Structure (17 sections, 12 theorems, 98 equations, 6 figures, 1 table, 1 algorithm)

Introduction
A Causal Model of AI-Assisted Decision Making
Impossibility of AI-Assisted Decision Making Under Calibration
AI-Assisted Decision Making Under Human-Aligned Calibration
Achieving Human-Aligned Calibration via Multicalibration
Experiments
Discussion and Limitations
Conclusions
Proofs
Additional Lemmas
Proof of Theorem \ref{th:suboptimal_monoAP}
Proof of Theorem \ref{th:optimal_monoAP}
Proof of Theorem \ref{th:multcal_to_alignment}
Proof of Proposition \ref{prop:multical_to_alignment}
Proof of Theorem \ref{th:UMD_calibration}
...and 2 more sections

Key Result

Theorem 3

There exist (infinitely many) AI-assisted decision making processes $\mathcal{M}$ satisfying Eqs. \ref{eq:scm-1} and \ref{eq:scm-2}, with utility functions $u(T, Y)$ satisfying Eq. \ref{eq:utility_prop_new}, such that $f_{B}$ is perfectly calibrated and $f_{H}$ is monotone but any AI-assisted decision policy $\pi \in

Figures (6)

Figure 1: Our structural causal model $\mathcal{M}$. Orange circles represent endogenous random variables and blue boxes represent exogenous random variables. The value of each endogenous variable is given by a function of the values of its ancestors in the structural causal model, as defined by Eqs. \ref{eq:scm-1} and \ref{eq:scm-2}. The value of each exogenous variable is sampled independently from a given distribution.
Figure 2: Empirical estimate of the probabilities $P(Y = 1 \,\mid\, (X,Y) \in {\mathcal{S}}_{h, \lambda(b)})$, where $b \in \Lambda[0,1]$ and $h \in \{\text{low},\text{mid},\text{high}\}$ are the discretized confidence values for the classifiers and human participants, respectively. Error bars represent $90\%$ confidence intervals and hatched bars mark alignment violations between confidence pairs $(h, b)$ with $|{\mathcal{S}}_{h, \lambda(b)}| \geq 30$.
Figure 3: Empirical estimate of the average difference $\mathbb{E}[h_{\text{+AI}} - h \,\mid\, (X, Y) \in {\mathcal{S}}_{h, \lambda(b)}]$, where $b \in \Lambda[0,1]$ and $h \in \{\text{low},\text{mid},\text{high}\}$ are the discretized confidence values for the classifier and human participants, respectively. Error bars represent $90\%$ confidence intervals and hatched bars mark alignment violations between confidence pairs $(h, b)$ with $|{\mathcal{S}}_{h, \lambda(b)}| \geq 30$.
Figure 4: Nonzero values of $P(Y = 1 \mid H=h_i, B=b_j)$ and $P(H=h_i, B=b_j)$ for every $h_i \in \mathcal{H}$ and $b_j \in \mathcal{B}$ used in the first (left) and second (right) part of the proof of Theorem \ref{th:suboptimal_monoAP}. In each cell $(h_i, b_j)$ in both panels, $P^{+}$ or $P^{-}$ is the value of $P(Y = 1 \mid H=h_i, B=b_j)$ and lighter color means lower value of $P(H=h_i, B=b_j)$, where white means $P(Y = 1 \mid H=h_i, B = b_j) = 0$ and $P(H,B)=0$. In both panels, the assignment of values is very stylized to facilitate the proof: the classifier's confidence function $f_B$ partitions the feature space in a way such that a rational decision maker is unable to take decisions that maximize utility for almost all confidence values. However, less stylized examples also satisfy the conditions of Lemma \ref{lem:optimalpi}. For example, as long as there is one triplet of confidence values $b_2, h_2, h_3$ (or $h_3, b_1, b_2$ in the left example) for which a rational decision maker is unable to take decisions that maximize utility, Lemma \ref{lem:optimalpi} can be applied.
Figure 5: Nonzero values of $P(Y = 1 \mid X, H=h_i, X \in I_j)$ for every $h_i \in \mathcal{H}$, with $|\mathcal{H}|=3$, and $I_j = (q_{j-1}, q_j]$, with $q_j \in Q_{4}$ used in the last part of the proof of Theorem \ref{th:suboptimal_monoAP}. Lighter color means lower value of $f^-$ or $f^+$.
...and 1 more figure
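
The alignment violations marked in Figures 2 and 3 can be illustrated with a small Python sketch. It estimates P(Y = 1) in each cell of discretized human and classifier confidence, keeps only cells with at least 30 samples (the threshold stated in the captions), and flags pairs of cells in which both confidence values are at least as high but the estimated probability is lower. This is a simplified monotonicity check under stated assumptions, not the paper's exact Definition 4: the function names, the integer encoding of confidence bins, and the tolerance parameter `alpha` are hypothetical.

```python
from itertools import product

def cell_rates(records, min_count=30):
    """Empirical P(Y=1) per (h, b) cell.

    records: iterable of (h, b, y), where h and b are ordinal bin indices
    (hypothetical encoding, e.g. low=0, mid=1, high=2) and y is 0 or 1.
    Cells with fewer than min_count samples are dropped.
    """
    counts, positives = {}, {}
    for h, b, y in records:
        counts[(h, b)] = counts.get((h, b), 0) + 1
        positives[(h, b)] = positives.get((h, b), 0) + y
    return {cell: positives[cell] / n for cell, n in counts.items() if n >= min_count}

def alignment_violations(rates, alpha=0.0):
    """Pairs (cell1, cell2) with h1 <= h2 and b1 <= b2 whose rate drops by more than alpha."""
    violations = []
    for (c1, r1), (c2, r2) in product(rates.items(), repeat=2):
        if c1 != c2 and c1[0] <= c2[0] and c1[1] <= c2[1] and r1 - r2 > alpha:
            violations.append((c1, c2))
    return violations
```

For instance, if the cell (low, low) had an estimated rate of 2/3 and the cell (mid, mid) a rate of 1/3, the pair would be flagged with `alpha=0.0` but not with `alpha=0.5`.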

Theorems & Definitions (20)

Definition 1: Monotone AI-assisted decision policy
Definition 2: Calibration
Theorem 3
Definition 4: Human-alignment
Theorem 5
Corollary 1
Definition 6: Human-aligned calibration
Definition 7: Multicalibration
Theorem 8
Proposition 1
...and 10 more
