Rethinking Human-AI Collaboration in Complex Medical Decision Making: A Case Study in Sepsis Diagnosis

Shao Zhang; Jianing Yu; Xuhai Xu; Changchang Yin; Yuxuan Lu; Bingsheng Yao; Melanie Tory; Lace M. Padilla; Jeffrey Caterino; Ping Zhang; Dakuo Wang

TL;DR

This work examines the gap between high predictive performance of AI models and their real-world deployment in sepsis diagnosis. It introduces SepsisLab, a human-centered AI system that not only predicts current and near-term sepsis risk but also visualizes uncertainty and recommends actionable lab tests to reduce ambiguity, thereby supporting intermediate decision-making stages such as hypothesis generation and data gathering. Grounded in a formative study with clinicians who critique the existing Epic Sepsis Module, SepsisLab reframes AI as a collaborator rather than a competitor and demonstrates improved perceived collaboration, transparency, and utility. The findings suggest that shifting AI focus to intermediate decision-support tasks can generalize to other high-stakes, time-sensitive domains, offering practical guidance for deploying trustworthy AI-CDSS in complex clinical workflows.

Abstract

Today's AI systems for medical decision support often succeed on benchmark datasets in research papers but fail in real-world deployment. This work focuses on decision making for sepsis, an acute, life-threatening systemic infection that requires the clinician to make an early diagnosis under high uncertainty. Our aim is to explore the design requirements for AI systems that can support clinical experts in making better decisions for the early diagnosis of sepsis. The study begins with a formative study investigating why clinical experts abandon an existing AI-powered sepsis predictive module in their electronic health record (EHR) system. We argue that a human-centered AI system needs to support human experts in the intermediate stages of a medical decision-making process (e.g., generating hypotheses or gathering data), instead of focusing only on the final decision. Therefore, we build SepsisLab based on a state-of-the-art AI algorithm and extend it to predict the future projection of sepsis development, visualize the prediction uncertainty, and propose actionable suggestions (i.e., which additional laboratory tests can be collected) to reduce such uncertainty. Through heuristic evaluation with six clinicians using our prototype system, we demonstrate that SepsisLab enables a promising human-AI collaboration paradigm for the future of AI-assisted sepsis diagnosis and other high-stakes medical decision making.

Paper Structure (42 sections, 4 figures, 3 tables)

Introduction
Background and Related Work
Sepsis Diagnosis
Challenges of AI-empowered Clinical Decision-Making Support
AI-supported Clinical Decision Support Systems Design
Formative study: Current Practices and Challenges of AI-assisted Sepsis Diagnosis
Method
Result
Belated Sepsis Risk Prediction
Inaccurate Sepsis Risk Prediction
Lack of Explanations
No Actionable Insights
AI Helper or AI Challenger?
Summary of Results
SepsisLab: a Human-Centered AI System to Support Early Diagnosis of Sepsis
...and 27 more sections

Figures (4)

Figure 1: Existing Human-AI Interaction and "Competition" Paradigm. The current sepsis module mainly focuses on supporting the final decision-making stage [sox2013medical], yet physicians often find the AI predictions are too late and not helpful.
Figure 2: The Clinician's Medical Decision-Making Workflow with Support from SepsisLab. SepsisLab focuses on providing support to the intermediate steps of the clinical experts' decision-making process [sox2013medical], as opposed to existing AI modules that focus only on the final decision-making stage. SepsisLab can generate predictions for the patient's sepsis onset possibility (as the risk score) now and in the future (Design Strategy 1, Design Strategy 4), as shown in Step 1. It can further suggest additional lab tests by their impact on model uncertainty (Design Strategy 2), and the interactive visualization can help clinicians select the most valuable lab tests to support their decision (Design Strategy 3, Design Strategy 4), as shown in Step 2. Once new data are collected, the prediction visualization will be updated (Step 3), helping clinicians test hypotheses. Then, following our Design Strategy 5, clinicians can generate new hypotheses or reach final decisions (Step 4).
Figure 3: User Interface of Our Prototype System. (A) A list of patients with different sepsis risk prediction scores, colored from Green (no risk), to Yellow (medium risk), to Red (high risk). (B) The patient's demographics and the dashboard that includes the patient's vital signs, lab test results, and medical history. (C) Our SepsisLab system as an add-on to the existing EHR system. This UI currently illustrates a clinical expert examining the data of a high-risk patient who was admitted 15 hours ago. The AI suggests that the expert collect more lab results. The expert is interacting with the visualization to see how the sepsis prediction and its uncertainty would change if Lactate and WBC lab results were added. All patient names and demographic information in this screen capture are randomly generated fake data for illustration purposes.
Figure 4: The Interactive Lab Test Recommendation Module in SepsisLab. (a) The clinician can get an actionable list of recommended lab tests from SepsisLab. The items are ranked by how much they are expected to reduce the uncertainty of the future sepsis prediction. (b) The clinician can interact with SepsisLab to select a lab item and see, via a counterfactual prediction, the expected influence of its test result on the model uncertainty. (c) The clinician can select multiple lab items and see the combined expected influence of their results on the uncertainty.
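
The ranking idea in the Figure 4 caption can be illustrated with a toy sketch. This is not SepsisLab's algorithm, which is not specified on this page. It assumes a hypothetical logistic risk model with made-up weights (`WEIGHTS`, `BIAS`), a hypothetical set of equally likely values for each unmeasured lab (`PLAUSIBLE`), and it defines uncertainty as the spread of the risk score over those plausible values. Each lab is then scored by the expected drop in that spread across its counterfactual results.

```python
import itertools
import math
import statistics

# Hypothetical logistic risk model; weights are illustrative, not clinical.
WEIGHTS = {"lactate": 0.9, "wbc": 0.05, "creatinine": 0.3}
BIAS = -4.0

# Hypothetical, equally likely values for labs that have not been measured yet.
PLAUSIBLE = {
    "lactate": [1.0, 2.0, 4.0],
    "wbc": [6.0, 12.0, 18.0],
    "creatinine": [0.8, 1.2, 2.0],
}


def risk(labs):
    """Toy sepsis risk score in [0, 1] for a complete set of lab values."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in labs.items())
    return 1.0 / (1.0 + math.exp(-z))


def uncertainty(observed):
    """Std. dev. of the risk over all plausible values of the missing labs."""
    missing = [name for name in PLAUSIBLE if name not in observed]
    if not missing:
        return 0.0
    risks = [
        risk({**observed, **dict(zip(missing, combo))})
        for combo in itertools.product(*(PLAUSIBLE[m] for m in missing))
    ]
    return statistics.pstdev(risks)


def expected_uncertainty_after(observed, labs_to_order):
    """Average remaining uncertainty over counterfactual results of the labs."""
    outcomes = itertools.product(*(PLAUSIBLE[lab] for lab in labs_to_order))
    remaining = [
        uncertainty({**observed, **dict(zip(labs_to_order, outcome))})
        for outcome in outcomes
    ]
    return statistics.fmean(remaining)


def rank_labs(observed):
    """Rank unmeasured labs by expected uncertainty reduction, largest first."""
    base = uncertainty(observed)
    missing = [name for name in PLAUSIBLE if name not in observed]
    scored = [
        (name, base - expected_uncertainty_after(observed, [name]))
        for name in missing
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, reduction in rank_labs({}):
        print(f"{name}: expected uncertainty reduction {reduction:.4f}")
```

Passing several labs to `expected_uncertainty_after` mirrors panel (c): the combined effect of ordering multiple tests is evaluated jointly rather than by adding up the single-test scores.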
