Revealing Treatment Non-Adherence Bias in Clinical Machine Learning Using Large Language Models

Zhongyuan Liang; Arvind Suresh; Irene Y. Chen

TL;DR

This study investigates how treatment non-adherence in electronic health records biases clinical machine learning models for hypertension. Using a large language model to extract adherence signals from clinical notes in a 3,623-patient cohort, the authors identify 786 non-adherent patients (21.7%) and report the demographic and clinical factors associated with non-adherence, along with patient-reported reasons for it. They demonstrate that ignoring non-adherence can reverse estimated treatment effects in causal inference, reduce predictive performance by up to 5%, and widen fairness disparities; they also show that removing non-adherent data can improve both accuracy and equity. The work illustrates a practical pipeline for incorporating adherence information into real-world clinical ML workflows and highlights the need for responsible, equitable modeling that accounts for treatment non-adherence bias across diseases.

Abstract

Machine learning systems trained on electronic health records (EHRs) increasingly guide treatment decisions, but their reliability depends on the critical assumption that patients follow the prescribed treatments recorded in EHRs. Using EHR data from 3,623 hypertension patients, we investigate how treatment non-adherence introduces implicit bias that can fundamentally distort both causal inference and predictive modeling. By extracting patient adherence information from clinical notes using a large language model (LLM), we identify 786 patients (21.7%) with medication non-adherence. We further uncover key demographic and clinical factors associated with non-adherence, as well as patient-reported reasons including side effects and difficulties obtaining refills. Our findings demonstrate that this implicit bias can not only reverse estimated treatment effects, but also degrade model performance by up to 5% while disproportionately affecting vulnerable populations by exacerbating disparities in decision outcomes and model error rates. This highlights the importance of accounting for treatment non-adherence in developing responsible and equitable clinical machine learning systems.
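To make the sign-reversal claim concrete, the following is a minimal toy sketch, not the authors' method or data. It assumes a made-up cohort in which patients recorded as treated but flagged non-adherent see their blood pressure rise, and it compares a naive contrast on recorded treatment with the same contrast after dropping flagged patients. All counts, outcome values, and function names (`build_cohort`, `naive_effect`, `adherence_aware_effect`) are hypothetical.

```python
from statistics import mean

# Hypothetical toy cohort: every number below is invented for illustration.
# Each record: (recorded_treated, flagged_non_adherent, change_in_systolic_bp_mmHg)
def build_cohort(n_adherent, n_non_adherent, n_untreated):
    cohort = []
    cohort += [(True, False, -10.0)] * n_adherent      # took the drug, BP falls
    cohort += [(True, True, 15.0)] * n_non_adherent    # prescribed but not taken, BP rises
    cohort += [(False, False, 0.0)] * n_untreated      # never prescribed, BP unchanged
    return cohort

def naive_effect(cohort):
    """Difference in mean outcome by recorded treatment, ignoring adherence."""
    treated = [y for t, _, y in cohort if t]
    control = [y for t, _, y in cohort if not t]
    return mean(treated) - mean(control)

def adherence_aware_effect(cohort):
    """Same contrast after dropping patients flagged as non-adherent."""
    kept = [row for row in cohort if not row[1]]
    return naive_effect(kept)

cohort = build_cohort(n_adherent=40, n_non_adherent=60, n_untreated=100)
naive = naive_effect(cohort)            # positive: the drug appears to raise BP
aware = adherence_aware_effect(cohort)  # negative: the drug lowers BP among those who took it
print(f"naive: {naive:+.1f} mmHg, adherence-aware: {aware:+.1f} mmHg")
```

In this toy setting the naive estimate has the opposite sign from the adherence-aware one. Dropping flagged patients is only the simplest possible adjustment; in real data it can introduce selection effects of its own, so it should be read as an illustration of the bias rather than a recommended correction.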
