A Bayesian Model for Multi-stage Censoring

Shuvom Sadhuka, Sophia Lin, Bonnie Berger, Emma Pierson

TL;DR

The paper tackles bias from multi-stage censoring in funnel decision pipelines, particularly when ground-truth outcomes are only observed for a subset of patients. It proposes a Bayesian funnel model that jointly infers the ground-truth label and the human censoring decisions across $K$ stages, using a discriminant-distribution parameterization for the stage-risk $p_{i,k}$ with mean $\phi_{i,k}=f_\beta(X_{i,k})$ and shape $\delta_k$, and thresholds $t_k$ to define transition probabilities $A_{i,k \rightarrow m}$. The model is fitted via Stan with MCMC on synthetic data and a real-world MIMIC-IV ED case study, demonstrating improved parameter recovery, risk prediction (AUROC) and calibration over baselines, and revealing gender-based differences in hospital and ICU admission thresholds (e.g., $t_{hosp,F}>t_{hosp,M}$, $t_{ICU,F}>t_{ICU,M}$). The findings highlight the importance of accounting for sequential censoring when estimating risk and planning resource allocation in healthcare, with potential applicability to other funnel-like decision processes.

Abstract

Many sequential decision settings in healthcare feature funnel structures characterized by a series of stages, such as screenings or evaluations, where the number of patients who advance to each stage progressively decreases and decisions become increasingly costly. For example, an oncologist may first conduct a breast exam, followed by a mammogram for patients with concerning exams, followed by a biopsy for patients with concerning mammograms. A key challenge is that the ground truth outcome, such as the biopsy result, is only revealed at the end of this funnel. The selective censoring of the ground truth can introduce statistical biases in risk estimation, especially in underserved patient groups, whose outcomes are more frequently censored. We develop a Bayesian model for funnel decision structures, drawing from prior work on selective labels and censoring. We first show in synthetic settings that our model is able to recover the true parameters and predict outcomes for censored patients more accurately than baselines. We then apply our model to a dataset of emergency department visits, where in-hospital mortality is observed only for those who are admitted to either the hospital or ICU. We find that there are gender-based differences in hospital and ICU admissions. In particular, our model estimates that the mortality risk threshold for ICU admission is higher for women (5.1%) than for men (4.5%).

Paper Structure

This paper contains 23 sections, 17 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Mechanics of the model. (Left) At each stage, the decision-maker estimates the mortality risk of a patient, given some covariates (e.g., blood pressure, or BP). If the drawn risk is larger than the threshold to hospitalize (or admit to ICU), the decision-maker moves them onto the appropriate stage. New variables, such as X-rays, may be collected at later stages, which are used to update the decision-maker's risk estimate. The ground truth is only observed for patients who are admitted to the hospital or ICU. (Right) A graphical representation of the patient flows. At each stage, $X$ represents the set of covariates available (e.g., BP and X-ray), while $D$ is the decision made for that patient at a particular stage (e.g., $D_{ED}$ could be discharge, admit to hospital, or admit to ICU). $Y$ is unobserved for patients discharged from the ED.
  • Figure 2: Fitted coefficients in MIMIC for male (blue) and female (yellow) patients. Asterisks indicate statistically significant differences by gender in the fitted coefficients. There are notable differences in the relative importance of age and O2 saturation in predicting mortality risk across gender.
  • Figure 3: Calibration plots of threshold parameters, funnel model.
  • Figure 4: Calibration plots of logistic regression baselines and funnel model. Coverage is the empirical coverage of the 95% confidence intervals.
  • Figure 5: Our model is able to recover the true admit and mortality rates in MIMIC data.
  • ...and 1 more figure
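The mechanics described in the Figure 1 caption can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' Stan model: the Beta distribution stands in for the discriminant distribution (with mean `phi` and concentration `delta`), the outcome is assumed to be a Bernoulli draw on the perceived risk, and the function names and the parameter values in the usage note are hypothetical.

```python
import random


def draw_risk(phi, delta, rng):
    """Draw a perceived mortality risk in (0, 1).

    Assumption: a Beta distribution with mean phi and concentration delta
    is used here as a stand-in for the paper's discriminant distribution.
    """
    return rng.betavariate(phi * delta, (1.0 - phi) * delta)


def ed_decision(risk, t_hosp, t_icu):
    """Map a drawn risk to an ED decision by comparing it to two thresholds."""
    if risk > t_icu:
        return "icu"
    if risk > t_hosp:
        return "hospital"
    return "discharge"


def simulate(n, phi, delta, t_hosp, t_icu, seed=0):
    """Simulate n ED visits and censor the outcome for discharged patients."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        risk = draw_risk(phi, delta, rng)
        decision = ed_decision(risk, t_hosp, t_icu)
        # Assumption: the outcome is Bernoulli with probability equal to the risk.
        died = rng.random() < risk
        # The ground truth is only observed for admitted patients.
        observed = died if decision != "discharge" else None
        records.append((decision, observed))
    return records
```

For example, `simulate(1000, phi=0.03, delta=50.0, t_hosp=0.02, t_icu=0.05)` returns a list of `(decision, observed_outcome)` pairs in which every discharged patient has `None` as the outcome. A model fitted only to the non-`None` rows sees a risk-selected subset of patients, which is the source of the bias the funnel model is designed to correct.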