Handling missing values in clinical machine learning: Insights from an expert study

Lena Stempfle, Arthur James, Julie Josse, Tobias Gauss, Fredrik D. Johansson

TL;DR

The paper investigates how clinicians handle missing values at test time when working with interpretable machine learning (IML) models, using a real-world hemorrhagic shock prediction task. It surveys 55 clinicians from 29 French trauma centers (20 complete responses), gathering qualitative data and quantitatively analyzing attitudes toward three IML approaches (risk scores, linear models, and decision trees) under missing data. Key findings: clinicians prefer models that handle missingness natively, rely on observed data and medical intuition rather than on imputed values, and favor transparent, interpretable imputation approaches over black-box methods. The study highlights the need to embed clinical reasoning into model design (e.g., native missingness handling such as MIA), to co-design models with clinicians, and to communicate uncertainty explicitly so that models are more readily adopted in clinical workflows.

Abstract

Inherently interpretable machine learning (IML) models offer valuable support for clinical decision-making but face challenges when features contain missing values. Traditional approaches, such as imputation or discarding incomplete records, are often impractical in scenarios where data is missing at test time. We surveyed 55 clinicians from 29 French trauma centers, collecting 20 complete responses to study their interaction with three IML models in a real-world clinical setting for predicting hemorrhagic shock with missing values. Our findings reveal that while clinicians recognize the value of interpretability and are familiar with common IML approaches, traditional imputation techniques often conflict with their intuition. Instead of imputing unobserved values, they rely on observed features combined with medical intuition and experience. As a result, methods that natively handle missing values are preferred. These findings underscore the need to integrate clinical reasoning into future IML models to enhance human-computer interaction.
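
To illustrate what "natively handling" missing values can mean for a decision tree, the sketch below fits a single split in the style of MIA (often expanded as "missing incorporated in attribute"): during training, each candidate threshold is tried twice, once sending missing values left and once right, and the better option is kept. At test time a missing value simply follows the learned direction, so no imputation is needed. This is a minimal sketch and not the models used in the study; the function names `fit_mia_stump` and `goes_left`, the Gini criterion, and the toy data are all assumptions made for illustration.

```python
import math
from typing import List, Optional, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def goes_left(value: Optional[float], threshold: float, missing_left: bool) -> bool:
    """Route one sample: missing values follow the learned default direction."""
    if value is None:
        return missing_left
    return value <= threshold


def fit_mia_stump(x: List[Optional[float]], y: List[int]) -> Tuple[float, bool]:
    """Pick (threshold, missing_left) minimizing the weighted Gini impurity."""
    observed = sorted({v for v in x if v is not None})
    # Midpoints between observed values, plus +inf, which yields a pure
    # "observed vs. missing" split when missing values are sent right.
    thresholds = [(a + b) / 2 for a, b in zip(observed, observed[1:])] + [math.inf]
    best = None
    for t in thresholds:
        for missing_left in (True, False):
            left = [yi for xi, yi in zip(x, y) if goes_left(xi, t, missing_left)]
            right = [yi for xi, yi in zip(x, y) if not goes_left(xi, t, missing_left)]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, t, missing_left)
    return best[1], best[2]


# Toy data (hypothetical, not from the study): None marks a missing measurement.
toy_x = [1.0, 2.0, 4.0, 5.0, None, None]
toy_y = [0, 0, 1, 1, 1, 1]
threshold, missing_left = fit_mia_stump(toy_x, toy_y)
```

In this toy example the samples with a missing value share the label of the high-valued samples, so the learned split routes missing values to the same side as values above the threshold. The routing rule stays readable as a single tree branch, which is the property that makes this kind of handling compatible with interpretable models.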

Paper Structure

This paper contains 27 sections, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Overview of study objectives, key survey question types, and findings on best practices for clinical interpretable machine learning with missing values at test time.
  • Figure 2: Clinicians were given a patient sample (top) and three interpretable ML models (middle: decision tree, logistic regression; bottom: risk score) predicting hemorrhagic shock. We assessed their interaction with the models with missing values.
  • Figure 3: Clinician preferences for imputation methods across different IML models. We normalize by dividing the number who chose a combination by the total, as the total votes for a model can vary.
  • Figure 4: Experimental setup. Clinicians are shown a patient record with 5 features, one of which is missing, together with an interpretable machine-learning model and the survey questions. After the answers are gathered, qualitative and quantitative methods are used to analyze the results.
  • Figure 5: Cohort's attitude on AI/ML. We show the mode and standard deviation for eight statements exploring perspectives on AI/ML tools in healthcare, covering familiarity with these technologies, beliefs regarding their potential to replace physicians, expectations about their added value and support, and whether these tools adequately represent physicians' work to be useful. Missing values were not yet discussed.
  • ...and 1 more figures
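
The normalization mentioned in the Figure 3 caption can be read in more than one way. The sketch below shows one plausible reading, assumed here rather than confirmed by the source: each (model, imputation choice) count is divided by the total number of votes cast for that model, so models with different vote totals become comparable. The function name `normalized_preferences`, the labels "method A" and "method B", and the vote list are hypothetical and are not data from the study.

```python
from collections import Counter
from typing import Dict, List, Tuple

Vote = Tuple[str, str]  # (model, imputation choice)


def normalized_preferences(votes: List[Vote]) -> Dict[Vote, float]:
    """Share of each imputation choice among the votes cast for the same model."""
    pair_counts = Counter(votes)
    model_totals = Counter(model for model, _ in votes)
    return {
        (model, choice): count / model_totals[model]
        for (model, choice), count in pair_counts.items()
    }


# Hypothetical votes for illustration only.
toy_votes = [
    ("decision tree", "method A"),
    ("decision tree", "method A"),
    ("decision tree", "method B"),
    ("risk score", "method A"),
]
shares = normalized_preferences(toy_votes)
```

Under this reading the shares for each model sum to 1, regardless of how many clinicians answered the question for that model.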