Closing the Gap in High-Risk Pregnancy Care Using Machine Learning and Human-AI Collaboration

Hussein Mozannar; Yuria Utsumi; Irene Y. Chen; Stephanie S. Gervasi; Michele Ewing; Aaron Smith-McLallen; David Sontag

TL;DR

This work tackles the challenge of timely identification and risk stratification in high-risk pregnancy care by deploying a real-world ML system that identifies pregnancy episodes in near real-time and predicts risk of gestational hypertension and diabetes. It introduces HAPI, a hybrid algorithm that fuses anchor-based signals with a Lasso model, and a calibrated, explainable risk classifier, both surfaced through a nurse-facing UI. Validated on over 30k patients, the approach achieves an AUROC around 0.76 for complication risk and demonstrates improved nurse decision-making in user studies, underscoring the value of human-centered design in clinical ML deployments. The study highlights practical benefits for care management, while addressing fairness and privacy considerations and outlining future work in transfer learning and continual validation.

Abstract

A high-risk pregnancy is a pregnancy complicated by factors that can adversely affect the outcomes of the mother or the infant. Health insurers use algorithms to identify members who would benefit from additional clinical support. This work presents the implementation of a real-world ML-based system to assist care managers in identifying pregnant patients at risk of complications. In this retrospective evaluation study, we developed a novel hybrid-ML classifier to predict whether patients are pregnant and trained a standard classifier using claims data from a health insurance company in the US to predict whether a patient will develop pregnancy complications. These models were developed in cooperation with the care management team and integrated into a user interface with explanations for the nurses. The proposed models outperformed commonly used claim codes for the identification of pregnant patients at the cost of a manageable false positive rate. Our complication risk classifier shows that we can accurately triage patients by risk of complication. Our approach and evaluation are guided by human-centric design. In user studies, the nurses preferred the proposed models over existing approaches.

Paper Structure (27 sections, 10 figures, 11 tables, 3 algorithms)

Introduction
Objective.
Contributions.
Related Work
Methods
Dataset Creation For Pregnancy Start and End Identification
Algorithm For Pregnancy Start and End Identification
Dataset Creation for Pregnancy Complication Prediction
Extracting Evidence for Predictions
User Study Design
Statistical analysis
Results
Identifying Pregnancies From Claims Data
Predicting Pregnancy Complications
Bias/Fairness Audit For Pregnancy Complication Classifier
...and 12 more sections

Figures (10)

Figure 1: Illustration of our proposed algorithm HAPI for pregnancy identification. We first collect a historical dataset of members that is used to train the Lasso model that predicts the probability of members being pregnant. Then, at each point in time t in the patient's trajectory (weekly frequency), we pass their claim codes through HAPI which combines the Lasso model and the list of anchor pregnancy codes to obtain a probability of a member being pregnant. We visualize on the rightmost graph the probability of pregnancy during the member's gestation, where we also show the first instance where there is a code indicating pregnancy start compared to when HAPI predicted pregnancy start.
Figure 2: Patient dashboard sketch for the user study on pregnancy complications classification. The user interface consists of a left panel containing demographic information and two views: Overview and Visits. We show the subtab Diseases/Conditions from the Overview view, where the nurse can find the ICD codes for each condition and disease. The left panel shows patient information, the model prediction, and the history of prior complications. We color ICD codes positively associated with complications in red (intensity varies with correlation) and those negatively associated with complications in green.
Figure 3: Histogram of pregnancy identification delays for pregnancies with complications, for HAPI compared to the anchor codes. We measure the difference in days between the predicted start date and the actual start date for our model HAPI and for a set of predefined pregnancy start codes (anchor codes). Subfigure (a) shows the histogram of differences over all test patients, where the two distributions overlap. Subfigure (b) restricts to the subset of test patients where HAPI is earlier than the anchor codes (3.54% of the set) and shows how much earlier HAPI identifies these pregnancies.
Figure 4: Accuracy and AUROC of the Lasso pregnancy complication predictor as we predict later during the pregnancy. For each prediction time, we trim the patient data to that time and then predict using only the trimmed data. We plot the linear trend lines of the accuracy and AUROC, both of which increase over time; error bars represent 95% CI.
Figure 5: Illustration of the pregnancy cohort selection algorithm. First, the most recent pregnancy outcome is detected (red point), referencing outcome codes defined in Matcho et al. (2018). Then, we search for pregnancy start code(s) (blue point(s)) within the lookback window specified for the corresponding outcome in Matcho et al. (2018) (blue brackets); the earliest start code marks the start of that pregnancy episode. Finally, we do a forward search for any additional pregnancy outcome or complications, referencing additional outcome codes compiled internally at AIC (orange point); if one exists, the pregnancy outcome is updated. Member B is excluded from the cohort since no pregnancy start code was detected within the lookback window. Member C is excluded since there was no associated pregnancy outcome code; amenorrhea alone cannot indicate that a pregnancy has started, since it can be caused by non-pregnancy-related factors (e.g., stress, menopause).
...and 5 more figures
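
The cohort selection steps described in the Figure 5 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the code names, the lookback windows (300 and 120 days), and the function build_episode are hypothetical placeholders, and the choice to take the first later outcome in the forward search and to move the episode end date with it is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# Hypothetical placeholders: the real code lists and per-outcome lookback
# windows come from the cited reference and internal lists, not from here.
OUTCOME_LOOKBACK_DAYS = {"LIVE_BIRTH": 300, "MISCARRIAGE": 120}
START_CODES = {"PREG_START"}
EXTRA_OUTCOME_CODES = {"LATE_COMPLICATION"}


@dataclass(frozen=True)
class Claim:
    day: date
    code: str


@dataclass(frozen=True)
class Episode:
    start: date
    end: date
    outcome: str


def build_episode(claims: List[Claim]) -> Optional[Episode]:
    """Return the most recent pregnancy episode, or None if the member is excluded."""
    # Step 1: find the most recent pregnancy outcome code.
    outcomes = [c for c in claims if c.code in OUTCOME_LOOKBACK_DAYS]
    if not outcomes:
        return None  # no outcome code (like Member C in the caption)
    last = max(outcomes, key=lambda c: c.day)

    # Step 2: look back for start codes within the window for that outcome.
    window_start = last.day - timedelta(days=OUTCOME_LOOKBACK_DAYS[last.code])
    starts = [c.day for c in claims
              if c.code in START_CODES and window_start <= c.day <= last.day]
    if not starts:
        return None  # no start code in the window (like Member B)
    start = min(starts)  # the earliest start code marks the episode start

    # Step 3: forward search for an additional outcome or complication.
    # Assumption: the first later code replaces the outcome and end date.
    end, outcome = last.day, last.code
    later = [c for c in claims
             if c.code in EXTRA_OUTCOME_CODES and c.day > last.day]
    if later:
        first_later = min(later, key=lambda c: c.day)
        end, outcome = first_later.day, first_later.code
    return Episode(start, end, outcome)


# Toy members mirroring the three cases in the caption.
member_a = [Claim(date(2020, 1, 10), "PREG_START"),
            Claim(date(2020, 9, 20), "LIVE_BIRTH")]
member_b = [Claim(date(2019, 1, 5), "PREG_START"),
            Claim(date(2020, 9, 20), "LIVE_BIRTH")]
member_c = [Claim(date(2020, 3, 1), "AMENORRHEA")]
member_d = member_a + [Claim(date(2020, 9, 25), "LATE_COMPLICATION")]

episode_a = build_episode(member_a)
episode_b = build_episode(member_b)
episode_c = build_episode(member_c)
episode_d = build_episode(member_d)
```

In this toy data, member_a yields an episode, member_b and member_c are excluded, and member_d has its outcome updated by the forward search. Whether the forward search should be bounded in time is not stated in the caption, so the sketch leaves it unbounded.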
