When to repeat a biomarker test? Decomposing sources of variation from conditionally repeated measurements

Supun Manathunga, Mart P. Janssen, Yu Luo, W. Alton Russell, Mart Pothast

TL;DR

This work tackles when to repeat a biomarker test under conditionally observed measurements by decomposing observed variation into population and measurement components, with $\sigma_{\rm tot}^2 = \sigma_{\rm pop}^2 + \sigma_{\rm meas}^2$. It first develops two frequentist methods under normality (CE and ML) to estimate these components, showing sensitivity to normality and potential biases in real Hb data. To address non-normality and heavy tails, it introduces a Bayesian hierarchical framework with four model classes (Normal–Normal, Normal–t, Normal–mixture, Skew–t) and uses cross-validated marginal LPPD to select among them, ultimately adopting Normal–t for parity and interpretability. Applied to a large blood donor Hb dataset, the Bayesian approach yields robust estimates of population and measurement variability (e.g., $s \approx 0.36$ g/dL; female $\sigma_{\rm pop}^2 \approx 1.13$, male $\sigma_{\rm pop}^2 \approx 1.63$), and quantifies posterior misclassification risks for single or repeated measurements, informing evidence-based conditional retesting rules and practical decision-making in donor eligibility.
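
As a rough illustration of what a posterior misclassification risk looks like, the sketch below uses the simplest of the four model classes (Normal–Normal), where the posterior of a donor's true value has a closed form. It is not the Normal–t model the summary says was adopted. The population mean `mu`, the measurement variance `var_meas`, and the eligibility `threshold` are hypothetical values chosen for illustration; only the female population variance of 1.13 comes from the text above.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def prob_true_below(measurements, mu, var_pop, var_meas, threshold):
    """P(true value < threshold | measurements) under a Normal-Normal model.

    Assumes true value ~ N(mu, var_pop) and each measurement is the true
    value plus independent N(0, var_meas) noise (a simplifying assumption).
    """
    n = len(measurements)
    xbar = sum(measurements) / n
    # Conjugate update: precisions add, means are precision-weighted.
    post_var = 1.0 / (1.0 / var_pop + n / var_meas)
    post_mean = post_var * (mu / var_pop + n * xbar / var_meas)
    return normal_cdf((threshold - post_mean) / sqrt(post_var))


# Hypothetical inputs: mu, var_meas and threshold are NOT taken from the paper.
MU, VAR_POP, VAR_MEAS, THRESHOLD = 13.5, 1.13, 0.32, 12.5

risk_one = prob_true_below([12.0], MU, VAR_POP, VAR_MEAS, THRESHOLD)
risk_two = prob_true_below([12.0, 12.0], MU, VAR_POP, VAR_MEAS, THRESHOLD)
print(f"P(true Hb < threshold | one low reading):  {risk_one:.3f}")
print(f"P(true Hb < threshold | two low readings): {risk_two:.3f}")
```

With these inputs a second low reading raises the probability that the true value lies below the threshold, which is the kind of quantity a conditional retesting rule would weigh.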

Abstract

Repeating an imperfect biomarker test based on an initial result can introduce bias and influence misclassification risk. For example, in some blood donation settings, blood donors' hemoglobin is remeasured when the initial measurement falls below a minimum threshold for donor eligibility. This paper explores methods that use data resulting from processes with conditionally repeated biomarker measurement to decompose the variation in observed measurements of a continuous biomarker into population variability and variability arising from the measurement procedure. We present two frequentist approaches with analytical solutions, but these approaches perform poorly in a dataset of conditionally repeated blood donor hemoglobin measurements where normality assumptions are not met. We then develop a Bayesian hierarchical framework that allows for different distributional assumptions, which we apply to the blood donor hemoglobin dataset. Using a Bayesian hierarchical model that assumes normally distributed population hemoglobin and heavy-tailed $t$-distributed measurement variation, we found that the total measurement variation accounted for 22% of the total variance among females and 25% among males, with population standard deviations of $1.07\, \rm g/dL$ for female donors and $1.28\, \rm g/dL$ for male donors. Our Bayesian framework can use data resulting from any clinical process with conditionally repeated biomarker measurements to estimate individuals' misclassification risk after one or more noisy continuous measurements and inform evidence-based conditional retesting decision rules.
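
The reported shares and standard deviations can be related through the additive decomposition of total variance into population and measurement parts. The sketch below is a back-of-the-envelope calculation, not a result from the paper: it assumes the share is defined as measurement variance over total variance, and it starts from rounded figures, so the implied values are approximate. The function name is hypothetical.

```python
def implied_measurement_variance(sd_pop: float, meas_share: float) -> float:
    """Measurement variance implied by a population SD and a variance share.

    Assumes var_tot = var_pop + var_meas and meas_share = var_meas / var_tot,
    which gives var_meas = var_pop * meas_share / (1 - meas_share).
    """
    var_pop = sd_pop ** 2
    return var_pop * meas_share / (1.0 - meas_share)


# Rounded values quoted in the abstract (SD in g/dL, share of total variance).
reported = {"female": (1.07, 0.22), "male": (1.28, 0.25)}

implied = {}
for sex, (sd_pop, share) in reported.items():
    var_meas = implied_measurement_variance(sd_pop, share)
    implied[sex] = var_meas
    print(f"{sex}: implied measurement variance {var_meas:.2f} (g/dL)^2, "
          f"implied measurement SD {var_meas ** 0.5:.2f} g/dL")
```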

Paper Structure

This paper contains 22 sections, 27 equations, 13 figures, and 10 tables.

Figures (13)

  • Figure 1: (Top) Distribution of the first Hb measurement of males and females at all blood donor visits. (Bottom) Scatterplot of the first (X1) and the repeat (X2) Hb measurement among the subset of donors with two measurements at the same visit.
  • Figure 2: Scatterplot of the initial and repeat measurements for 1000 simulated conditionally repeated measurements. Density plots show the marginal distributions of $x_1$ and $x_2$ when pairs are observed for all individuals (green) and when $x_2$ is only observed when $x_1$ falls below $c=13$ g/dL (orange).
  • Figure 3: Difference between estimated $\hat{\sigma}_{\rm meas}$ and simulated $\sigma_{\rm meas}=0.8$ g/dL. Shown are the mean and 95% CI over 200 repeated simulated datasets, with recheck probability parameter $r$ from \ref{eq:recheck}.
  • Figure 4: Violin plots depicting the distribution of bootstrapped estimates for population variance and measurement error variance using the conditional expectation method and the maximum likelihood method, stratified by sex.
  • Figure 5: Percent difference between the estimated measurement error variance and the true variance (accounting for degrees of freedom) for the conditional expectation and maximum likelihood methods under truncation with $t$-distributed measurement error. Points show the mean bias across 100 simulations with $95\%$ confidence intervals, plotted against the degrees of freedom (df) on a log scale.
  • ...and 8 more figures
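
The setup described in the Figure 2 caption can be sketched in a few lines. This is a minimal simulation, assuming normal population values and normal measurement noise with a hard recheck threshold; the population mean and SD are hypothetical, while $c=13$ g/dL, the sample size of 1000, and the measurement SD of 0.8 g/dL are taken from the captions of Figures 2 and 3. It does not model the recheck probability parameter $r$ mentioned for Figure 3.

```python
import random
from statistics import mean


def simulate_conditional_repeats(n, mu, sd_pop, sd_meas, c, seed=0):
    """Simulate pairs (x1, x2) and flag whether x2 would be observed.

    Each individual has a true value ~ N(mu, sd_pop); x1 and x2 are that
    value plus independent N(0, sd_meas) noise. The repeat x2 counts as
    observed only when x1 < c (hard threshold, a simplifying assumption).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        truth = rng.gauss(mu, sd_pop)
        x1 = truth + rng.gauss(0.0, sd_meas)
        x2 = truth + rng.gauss(0.0, sd_meas)
        rows.append((x1, x2, x1 < c))
    return rows


# mu and sd_pop are hypothetical; c, n and sd_meas follow the figure captions.
rows = simulate_conditional_repeats(n=1000, mu=14.0, sd_pop=1.0, sd_meas=0.8, c=13.0)

# Mean change from first to second measurement, with and without selection.
all_diff = mean(x2 - x1 for x1, x2, _ in rows)
kept_diff = mean(x2 - x1 for x1, x2, observed in rows if observed)
n_kept = sum(1 for _, _, observed in rows if observed)

print(f"pairs observed under the rule: {n_kept} of {len(rows)}")
print(f"mean(x2 - x1), all pairs:      {all_diff:.2f}")
print(f"mean(x2 - x1), x1 < c only:    {kept_diff:.2f}")
```

Selecting on a low first reading preferentially keeps pairs whose first measurement error was negative, so the repeat tends to be higher on average. Estimators that ignore this selection can therefore be biased, which is the motivation for methods that account for the conditional observation process.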