A Bayesian Model for Multi-stage Censoring
Shuvom Sadhuka, Sophia Lin, Bonnie Berger, Emma Pierson
TL;DR
The paper tackles bias from multi-stage censoring in funnel decision pipelines, which arises when ground-truth outcomes are observed only for a subset of patients. It proposes a Bayesian funnel model that jointly infers the ground-truth label and the human censoring decisions across $K$ stages. The stage risk $p_{i,k}$ uses a discriminant-distribution parameterization with mean $\phi_{i,k}=f_\beta(X_{i,k})$ and shape $\delta_k$, and thresholds $t_k$ define the transition probabilities $A_{i,k \rightarrow m}$. The model is fit in Stan with MCMC on synthetic data and on a real-world MIMIC-IV emergency department (ED) case study, demonstrating improved parameter recovery, risk prediction (AUROC), and calibration over baselines, and revealing gender-based differences in hospital and ICU admission thresholds (e.g., $t_{hosp,F}>t_{hosp,M}$, $t_{ICU,F}>t_{ICU,M}$). The findings highlight the importance of accounting for sequential censoring when estimating risk and planning resource allocation in healthcare, with potential applicability to other funnel-like decision processes.
Abstract
Many sequential decision settings in healthcare feature funnel structures characterized by a series of stages, such as screenings or evaluations, where the number of patients who advance to each stage progressively decreases and decisions become increasingly costly. For example, an oncologist may first conduct a breast exam, followed by a mammogram for patients with concerning exams, followed by a biopsy for patients with concerning mammograms. A key challenge is that the ground truth outcome, such as the biopsy result, is only revealed at the end of this funnel. The selective censoring of the ground truth can introduce statistical biases in risk estimation, especially in underserved patient groups, whose outcomes are more frequently censored. We develop a Bayesian model for funnel decision structures, drawing from prior work on selective labels and censoring. We first show in synthetic settings that our model is able to recover the true parameters and predict outcomes for censored patients more accurately than baselines. We then apply our model to a dataset of emergency department visits, where in-hospital mortality is observed only for those who are admitted to either the hospital or ICU. We find that there are gender-based differences in hospital and ICU admissions. In particular, our model estimates that the mortality risk threshold for ICU admission is higher for women (5.1%) than for men (4.5%).
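To make the censoring bias concrete, the following is a minimal simulation sketch, not the paper's model. It assumes a two-stage funnel in which each patient has a latent risk drawn from a Beta distribution, a decision-maker advances the patient when a noisy estimate of that risk exceeds a stage threshold, and the outcome is recorded only for patients who clear both stages. The function name `simulate_funnel`, the Beta(1, 9) risk distribution, the Gaussian noise, and the threshold values are all illustrative assumptions.

```python
import random


def simulate_funnel(n=20000, thresholds=(0.05, 0.15), noise=0.05, seed=0):
    """Simulate a funnel where the outcome is observed only at the end.

    All distributions and parameter values are illustrative assumptions,
    not taken from the paper.
    """
    rng = random.Random(seed)
    true_outcomes = []       # outcome for every patient (unobservable in practice)
    observed_outcomes = []   # outcome only for patients who clear every stage
    reached = [0] * (len(thresholds) + 1)  # patients entering each stage / the end

    for _ in range(n):
        risk = rng.betavariate(1.0, 9.0)      # latent true risk, mean 0.1
        y = 1 if rng.random() < risk else 0   # ground-truth outcome
        true_outcomes.append(y)

        stage = 0
        reached[0] += 1
        for t in thresholds:
            # The decision-maker sees a noisy, clipped estimate of the risk.
            perceived = min(1.0, max(0.0, risk + rng.gauss(0.0, noise)))
            if perceived <= t:
                break  # censored: the patient does not advance
            stage += 1
            reached[stage] += 1

        if stage == len(thresholds):
            observed_outcomes.append(y)

    return true_outcomes, observed_outcomes, reached


true_y, obs_y, reached = simulate_funnel()
true_rate = sum(true_y) / len(true_y)
naive_rate = sum(obs_y) / len(obs_y)
print("patients reaching each stage:", reached)
print("true outcome rate:", round(true_rate, 3))
print("naive rate among uncensored patients:", round(naive_rate, 3))
```

Under these assumptions, the outcome rate among uncensored patients overstates the population rate, because only higher-risk patients survive the funnel. A model that ignores the selection steps inherits this bias, which is the motivation for modeling the censoring decisions jointly with the outcome.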
