Using AI Uncertainty Quantification to Improve Human Decision-Making
Laura R. Marusich, Jonathan Z. Bakdash, Yan Zhou, Murat Kantarcioglu
TL;DR
This work examines whether well-calibrated, instance-level AI Uncertainty Quantification (UQ) can improve human decision-making beyond AI predictions alone. It introduces a sampling-based UQ method, with calibration verified against ground truth using a strict scoring rule, and evaluates its effect in two preregistered online experiments across multiple datasets. Experiment 1 shows that AI UQ improves decision accuracy and confidence calibration compared with AI predictions alone. Experiment 2 finds no robust differences among uncertainty visualizations or representations, which suggests that the benefit generalizes across representations. Together, the results support the value of high-quality, instance-level UQ for human-AI interaction and point toward more reliable AI-assisted decision-making in real systems.
Abstract
AI Uncertainty Quantification (UQ) has the potential to improve human decision-making beyond AI predictions alone by providing additional probabilistic information to users. Most past research on AI and human decision-making has concentrated on model explainability and interpretability, with little attention to the potential impact of UQ on human decision-making. We evaluated the impact of instance-level UQ, calibrated using a strict scoring rule, on human decision-making in two online behavioral experiments. In the first experiment, UQ improved decision-making performance compared to AI predictions alone. In the second experiment, the benefits of UQ for decision-making generalized across a variety of representations of probabilistic information. These results indicate that implementing high-quality, instance-level UQ for AI may improve decision-making with real systems compared to AI predictions alone.
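To make the idea of checking calibration with a scoring rule concrete, the following is a minimal sketch that uses the Brier score, a standard strictly proper scoring rule for binary outcomes. The abstract does not name the rule the authors used, so the choice of the Brier score, the function names, and the example probability are all assumptions made for illustration. The sketch shows the property that makes such a rule useful: the expected score is best (lowest) only when the reported probability equals the true probability.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Lower is better. Illustrative choice of scoring rule, not necessarily
    the one used in the paper.
    """
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_brier(reported_q, true_p):
    """Expected Brier score when the event occurs with probability true_p
    but the model reports probability reported_q."""
    return true_p * (1 - reported_q) ** 2 + (1 - true_p) * reported_q ** 2


# Hypothetical instance whose true probability of the positive class is 0.7.
true_p = 0.7

# Search a grid of reported probabilities for the one with the lowest expected score.
grid = [i / 100 for i in range(101)]
best_q = min(grid, key=lambda q: expected_brier(q, true_p))

print(f"best reported probability: {best_q:.2f}")
```

Under these assumptions, the grid search selects the reported probability that matches the true one, which is why a low score under such a rule is evidence that instance-level uncertainty estimates are well calibrated rather than over- or underconfident.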
