Detecting clinician implicit biases in diagnoses using proximal causal inference
Kara Liu, Russ Altman, Vasilis Syrgkanis
TL;DR
This work tackles the challenge of measuring clinician implicit bias in diagnostic decisions from large observational health datasets. It applies proximal causal inference, using health proxies and a partially linear bridge function $q$, to identify the direct bias effect $\theta$ of a sociodemographic attribute $D$ on a diagnosis $Y$. Estimation relies on Neyman-orthogonal moments and a residual instrument $V=\tilde{D}-\gamma^T\tilde{Z}$. Identification hinges on relaxed assumptions that make proximal mediation feasible in real-world data; a proxy-selection algorithm screens for valid proxies, and multiple diagnostic tests assess instrument strength. Empirical evaluation on semi-synthetic data and the UK Biobank shows detectable biases across several $(D,Y)$ pairs and robust performance under weak-instrument conditions, and offers insights into intersectionality and influential patient subgroups. The approach provides a practical bias-detection tool for data audits and can inform efforts to reduce systemic discrimination in healthcare.
Abstract
Clinical decisions to treat and diagnose patients are affected by implicit biases formed by racism, ableism, sexism, and other stereotypes. These biases reflect broader systemic discrimination in healthcare and risk marginalizing already disadvantaged groups. Existing methods for measuring implicit biases require controlled randomized testing and only capture individual attitudes rather than outcomes. However, the "big-data" revolution has led to the availability of large observational medical datasets, like EHRs and biobanks, that provide the opportunity to investigate discrepancies in patient health outcomes. In this work, we propose a causal inference approach to detect the effect of clinician implicit biases on patient outcomes in large-scale medical data. Specifically, our method uses proximal mediation to disentangle pathway-specific effects of a patient's sociodemographic attribute on a clinician's diagnosis decision. We test our method on real-world data from the UK Biobank. Our work can serve as a tool that initiates conversation and brings awareness to unequal health outcomes caused by implicit biases.
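To make the residual-instrument idea from the TL;DR concrete, the following is a minimal simulated sketch, not the authors' implementation. It assumes a fully linear setting with one unobserved health factor $U$, one outcome-side proxy $X$, one instrument-side proxy $Z$, and observed covariates $W$. Tildes denote variables residualized on $W$ by ordinary least squares. The moment used to define $\gamma$, namely $E[\tilde{X}(\tilde{D}-\gamma^T\tilde{Z})]=0$, is an assumption of this sketch, as are the function names `residualize` and `proximal_theta`, the simulated coefficients, and the use of continuous $D$ and $Y$.

```python
import numpy as np


def residualize(a, W):
    """Return `a` minus its least-squares projection on [1, W]."""
    design = np.column_stack([np.ones(len(W)), W])
    coef, *_ = np.linalg.lstsq(design, a, rcond=None)
    return a - design @ coef


def proximal_theta(Y, D, X, Z, W):
    """Sketch of a residual-instrument estimate of theta (assumed linear model)."""
    Yt, Dt, Xt, Zt = (residualize(a, W) for a in (Y, D, X, Z))
    # Assumption: gamma solves the sample analogue of E[X~ (D~ - gamma^T Z~)] = 0.
    gamma = np.linalg.lstsq(Xt.T @ Zt, Xt.T @ Dt, rcond=None)[0]
    V = Dt - Zt @ gamma  # residual instrument V = D~ - gamma^T Z~
    # V is orthogonal to X~ by construction, so E[Y~ V] = theta * E[D~ V].
    return (V @ Yt) / (V @ Dt)


# Hypothetical simulated data: U is unobserved health status.
rng = np.random.default_rng(0)
n, theta_true = 100_000, 0.5
W = rng.normal(size=(n, 2))
U = rng.normal(size=n)
D = 0.8 * U + W @ np.array([0.3, -0.2]) + rng.normal(size=n)
X = (U + 0.5 * W[:, 0] + rng.normal(size=n))[:, None]  # outcome-side proxy
Z = (U - 0.4 * W[:, 1] + rng.normal(size=n))[:, None]  # instrument-side proxy
Y = theta_true * D + 1.5 * U + W @ np.array([0.2, 0.1]) + rng.normal(size=n)

theta_hat = proximal_theta(Y, D, X, Z, W)

# Naive comparison: regress Y~ on (D~, X~), treating the noisy proxy X as if it were U.
Yt, Dt, Xt = (residualize(a, W) for a in (Y, D, X))
naive_coef = np.linalg.lstsq(np.column_stack([Dt, Xt]), Yt, rcond=None)[0]
theta_naive = naive_coef[0]

print(f"true theta:        {theta_true:.3f}")
print(f"residual-IV theta: {theta_hat:.3f}")
print(f"naive OLS theta:   {theta_naive:.3f}")
```

In this simulation the naive regression is biased because $X$ measures $U$ with noise, so part of the health pathway is wrongly attributed to $D$. The residual instrument removes the component of $\tilde{D}$ that co-varies with the proxy, which is what lets the ratio recover $\theta$ here. The denominator, the sample covariance of $V$ and $\tilde{D}$, plays the role of instrument strength: if it were near zero the estimate would be unstable.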
