Explainable AI Enhances Glaucoma Referrals, Yet the Human-AI Team Still Falls Short of the AI Alone
Catalina Gomez, Ruolin Wang, Katharina Breininger, Corinne Casey, Chris Bradley, Mitchell Pavlak, Alex Pham, Jithin Yohannan, Mathias Unberath
TL;DR
This work addresses how explainable AI can assist primary eye care providers in triaging glaucoma referrals to specialists. It develops both a black-box DLM and an intrinsically interpretable WoE-based scorecard, paired with an optometrist-focused human-AI study using four interaction conditions to map AI outputs to urgent, near-term, or no referrals. Key findings show AI support improves referral accuracy over unaided decisions, yet AI-alone performance remains higher than human-AI teams, with scoring-based explanations offering the strongest signal for perceived usefulness and deployment willingness. The results highlight the potential and current limitations of human-AI collaboration in glaucoma care, underscoring the need for user-centered design and further research to safely integrate AI into primary care workflows.
Abstract
Primary care providers are vital for initial triage and referrals to specialty care. In glaucoma, asymptomatic and fast progression can lead to vision loss, necessitating timely referrals to specialists. However, primary eye care providers may not identify urgent cases, potentially delaying care. Artificial Intelligence (AI) offering explanations could enhance their referral decisions. We investigate how various AI explanations help providers distinguish between patients needing immediate or non-urgent specialist referrals. We built explainable AI algorithms to predict glaucoma surgery needs from routine eyecare data as a proxy for identifying high-risk patients. We incorporated intrinsic and post-hoc explainability and conducted an online study with optometrists to assess human-AI team performance, measuring referral accuracy and analyzing interactions with AI, including agreement rates, task time, and user experience perceptions. AI support enhanced referral accuracy among 87 participants (59.9%/50.8% with/without AI), though Human-AI teams underperformed compared to AI alone. Participants believed they included AI advice more when using the intrinsic model, and perceived it more useful and promising. Without explanations, deviations from AI recommendations increased. AI support did not increase workload, confidence, and trust, but reduced challenges. On a separate test set, our black-box and intrinsic models achieved an accuracy of 77% and 71%, respectively, in predicting surgical outcomes. We identify opportunities of human-AI teaming for glaucoma management in primary eye care, noting that while AI enhances referral accuracy, it also shows a performance gap compared to AI alone, even with explanations. Human involvement remains essential in medical decision making, underscoring the need for future research to optimize collaboration, ensuring positive experiences and safe AI use.
