Designing for Appropriate Reliance: The Roles of AI Uncertainty Presentation, Initial User Decision, and User Demographics in AI-Assisted Decision-Making
Shiye Cao, Anqi Liu, Chien-Ming Huang
TL;DR
This paper investigates how the presentation of AI uncertainty, a user's initial decision, and demographic factors shape appropriate reliance in AI-assisted decision-making. Using an online skin cancer screening task, it compares four uncertainty presentations: Baseline, Raw Probability, Calibrated Probability, and Calibrated Frequency, with calibration performed via beta calibration. The calibrated frequency presentation improves users' ability to adjust their reliance to the AI's uncertainty and reduces confirmation bias, whereas calibration alone offers limited benefits and over-reliance persists overall. The results suggest a path toward personalized AI aids whose design accounts for uncertainty presentation, the user's initial decision, and user demographics to improve human-AI collaboration in critical decisions.
Abstract
Appropriate reliance is critical to achieving synergistic human-AI collaboration. For instance, when users over-rely on AI assistance, their human-AI team performance is bounded by the model's capability. This work studies how the presentation of model uncertainty may steer users' decision-making toward appropriate reliance. Our results demonstrate that showing the calibrated model uncertainty alone is inadequate. Rather, calibrating model uncertainty and presenting it in a frequency format allow users to adjust their reliance accordingly and help reduce the effect of confirmation bias on their decisions. Furthermore, the critical nature of our skin cancer screening task skews participants' judgment, causing their reliance to vary depending on their initial decision. Additionally, step-wise multiple regression analyses revealed how user demographics such as age and familiarity with probability and statistics influence human-AI collaborative decision-making. We discuss the potential for model uncertainty presentation, initial user decision, and user demographics to be incorporated in designing personalized AI aids for appropriate reliance.
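
To make the idea of a calibrated frequency presentation concrete, the following is a minimal sketch, not the authors' implementation. It applies the standard three-parameter beta calibration map to a raw model confidence and then phrases the result as a frequency. The function names, the parameter values, and the wording of the frequency statement are all hypothetical; in practice the parameters a, b, and c would be fitted on held-out data.

```python
import math


def beta_calibrate(s: float, a: float, b: float, c: float) -> float:
    """Map a raw confidence s in (0, 1) to a calibrated probability.

    Uses the beta calibration form
        logit(p) = a * ln(s) - b * ln(1 - s) + c.
    The parameters a, b, c are assumed to have been fitted elsewhere.
    """
    eps = 1e-12
    s = min(max(s, eps), 1.0 - eps)  # keep the logarithms finite
    z = a * math.log(s) - b * math.log(1.0 - s) + c
    return 1.0 / (1.0 + math.exp(-z))


def to_frequency_statement(p: float, denominator: int = 100) -> str:
    """Phrase a probability as a natural frequency (hypothetical wording)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    count = round(p * denominator)
    return f"The AI is correct in about {count} out of {denominator} similar cases."


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    raw_confidence = 0.95
    calibrated = beta_calibrate(raw_confidence, a=0.8, b=0.6, c=-0.2)
    print(to_frequency_statement(calibrated))
```

With a = b = 1 and c = 0 the map reduces to the identity, which makes a convenient sanity check before plugging in fitted parameters.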
