Ethical considerations of use of hold-out sets in clinical prediction model management

Louis Chislett; Louis JM Aslett; Alisha R Davies; Catalina A Vallejos; James Liley

TL;DR

The paper addresses the ethical challenges of updating clinical prediction models (CPMs) when predictions trigger interventions that alter outcomes, a phenomenon known as performative prediction. It proposes hold-out sets, with mutually exclusive hold-out training data $(X^H,Y^H)$ and intervention data $(X^I,Y^I)$, so that CPMs are retrained on data reflecting typical practice. This yields risk estimates for $\Pr(Y=1 \mid X)$ under standard care and mitigates contamination from post-intervention data. It analyzes three sampling strategies (simple random, cluster randomised, and voluntary response) and evaluates them against beneficence, non-maleficence, autonomy, justice, informed consent, clinical equipoise, and truth-telling, with case studies illustrating how the balance varies across settings. It further discusses hold-out sets as a tool for measuring CPM effectiveness, in addition to updating, using designs akin to randomized trials (e.g., the Welsh stepped wedge). The authors conclude that ethical viability is setting-dependent and that hold-out sets may be necessary when causal data are lacking, but that they require robust protocols, stakeholder involvement, and careful balancing of individual and population-level considerations.
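To make two of the sampling strategies concrete, the following is a minimal sketch, not the authors' procedure, of how a hold-out set might be drawn by simple random sampling of patients versus cluster randomisation of whole clinics. The cohort, the clinic assignment, and the function names `simple_random_holdout` and `cluster_randomised_holdout` are hypothetical and introduced only for illustration. Voluntary response is not simulated, since it depends on who chooses to opt in.

```python
import random

def simple_random_holdout(patient_ids, fraction, seed=0):
    """Draw a hold-out set of individual patients, ignoring clinic membership."""
    rng = random.Random(seed)
    ids = sorted(patient_ids)
    k = round(fraction * len(ids))
    return set(rng.sample(ids, k))

def cluster_randomised_holdout(clinic_of, n_holdout_clinics, seed=0):
    """Draw whole clinics (clusters); every patient in a chosen clinic is held out."""
    rng = random.Random(seed)
    clinics = sorted(set(clinic_of.values()))
    chosen = set(rng.sample(clinics, n_holdout_clinics))
    return {pid for pid, clinic in clinic_of.items() if clinic in chosen}

# Hypothetical cohort: 1000 patients spread evenly over 20 clinics.
clinic_of = {pid: pid % 20 for pid in range(1000)}

holdout_srs = simple_random_holdout(clinic_of.keys(), fraction=0.1)
holdout_cluster = cluster_randomised_holdout(clinic_of, n_holdout_clinics=2)

# The intervention set is everyone not held out, so the two sets are mutually exclusive.
intervention_srs = set(clinic_of) - holdout_srs
```

Under simple random sampling the held-out patients are scattered across all clinics, whereas under cluster randomisation every patient in a selected clinic goes without model-derived risk scores. That difference bears on both the statistical and the ethical questions the paper raises.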

Abstract

Clinical prediction models are statistical or machine learning models used to quantify the risk of a certain health outcome using patient data. These can then inform potential interventions on patients, causing an effect called performative prediction: predictions inform interventions which influence the outcome they were trying to predict, leading to a potential underestimation of risk in some patients if a model is updated on these data. One suggested resolution to this is the use of hold-out sets, in which a set of patients do not receive model-derived risk scores, such that a model can be safely retrained. We present an overview of clinical and research ethics regarding potential implementation of hold-out sets for clinical prediction models in health settings. We focus on the ethical principles of beneficence, non-maleficence, autonomy and justice. We also discuss informed consent, clinical equipoise, and truth-telling. We present illustrative cases of potential hold-out set implementations and discuss statistical issues arising from different hold-out set sampling methods. We also discuss differences between hold-out sets and randomised controlled trials, in terms of ethics and statistical issues. Finally, we give practical recommendations for researchers interested in the use of hold-out sets for clinical prediction models.
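The underestimation described above can be illustrated with a small simulation. This is a sketch under stated assumptions rather than anything taken from the paper: the risk levels in `TRUE_RISK`, the intervention effect `RISK_REDUCTION`, and the hold-out fraction are all invented, and the "model" is reduced to an observed event rate within a single high-risk group.

```python
import random

rng = random.Random(42)

# All numbers below are assumptions chosen for illustration only.
TRUE_RISK = {"low": 0.05, "high": 0.40}  # Pr(Y=1 | X) under standard care
RISK_REDUCTION = 0.5                     # intervention halves the risk
HOLDOUT_FRACTION = 0.2                   # share of patients who get no risk score

def simulate_patient():
    x = "high" if rng.random() < 0.3 else "low"
    in_holdout = rng.random() < HOLDOUT_FRACTION
    # Outside the hold-out set, a high risk score triggers the intervention.
    treated = (not in_holdout) and x == "high"
    risk = TRUE_RISK[x] * (RISK_REDUCTION if treated else 1.0)
    y = 1 if rng.random() < risk else 0
    return x, y, in_holdout

def event_rate(records, x):
    """Observed outcome rate among patients with covariate value x."""
    ys = [y for (xi, y, _) in records if xi == x]
    return sum(ys) / len(ys)

cohort = [simulate_patient() for _ in range(200_000)]
holdout = [r for r in cohort if r[2]]
intervention = [r for r in cohort if not r[2]]

# Retraining on intervention-set data learns the post-intervention risk.
naive_estimate = event_rate(intervention, "high")
# Retraining on hold-out data recovers the risk under standard care.
holdout_estimate = event_rate(holdout, "high")

print(f"intervention-set estimate: {naive_estimate:.3f}")
print(f"hold-out estimate:         {holdout_estimate:.3f}")
```

With these assumed values, the estimate from the intervention set should sit near the post-intervention risk, roughly half the standard-care risk, while the hold-out estimate should sit near the standard-care risk of the high-risk group. A model updated on the intervention set would therefore tell clinicians that these patients are lower risk than they are when untreated.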

Paper Structure (19 sections, 1 figure)

This paper contains 19 sections, 1 figure.

Introduction
Methods
Setting
Necessity of hold-out sets
Sampling hold-out sets
Ethical considerations
Beneficence:
Non-maleficence:
Autonomy:
Justice:
Informed consent:
Clinical equipoise:
Challenges to shared decision-making:
Case studies
Hold-out sets to measure the effectiveness of a CPM
...and 4 more sections

Figures (1)

Figure 1: Dynamics of a CPM trained once and updated twice using hold-out set methodology. Squares containing $X$ and $Y$ denote covariates and outcome respectively, with superscripts $I$ and $H$ denoting mutually exclusive intervention and hold-out sets.
