Meta-analysis of diagnostic test accuracy with multiple disease stages: combining stage-specific and merged-stage data

Efthymia Derezea, Nicky J Welton, Gabriel Rogers, Hayley E Jones

TL;DR

The paper develops a Bayesian multivariate framework to meta-analyze diagnostic test accuracy across multiple disease stages by jointly incorporating stage-specific and merged-stage data for both binary and continuous test results. By modeling stage proportions and allowing merged data to inform stage-specific sensitivities (and, for continuous tests, thresholds across stages), the approach can yield more precise and biologically plausible estimates when stage-specific data are sparse. The methods are demonstrated with simulations and applied to hepatocellular carcinoma screening data in cirrhosis, showing improved inference and the ability to correct implausible results that arise from limited stage-specific information. This joint modeling strategy has practical implications for better informing clinical screening decisions and health-economic analyses where stage-specific detection matters, and it provides a flexible framework that can be extended to more complex data structures (e.g., HSROC).

Abstract

For many conditions, it is of clinical importance to know not just the ability of a test to distinguish between those with and without the disease, but also the sensitivity to detect disease at different stages: in particular, the test's ability to detect disease at a stage most amenable to treatment. In a systematic review of test accuracy, pooled stage-specific estimates can be produced using subgroup analysis or meta-regression. However, this requires stage-specific data from each study, which are often not reported. Studies may, however, report test sensitivity for merged stage categories (e.g. stages I-II) or merged across all stages, together with information on the proportion of patients with disease at each stage. We demonstrate how to incorporate studies reporting merged stage data alongside studies reporting stage-specific data, to allow the inclusion of more studies in the meta-analysis. We consider both meta-analysis of tests with binary results, and meta-analysis of tests with continuous results, where the sensitivity to detect disease of each stage across the whole range of observed thresholds is estimated. The methods are demonstrated using a series of simulated datasets and applied to data from a systematic review of the accuracy of tests used to screen for hepatocellular carcinoma in people with liver cirrhosis. We show that incorporating studies with merged stage data can lead to more precise estimates and, in some cases, can correct biologically implausible results that arise when the availability of stage-specific data is limited.
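The link between merged-stage and stage-specific data described in the abstract can be illustrated with a small sketch. It assumes only the law of total probability: the sensitivity observed for a merged category is the average of the stage-specific sensitivities, weighted by the proportion of diseased patients at each stage. The function name `merged_sensitivity` and all numbers below are hypothetical, chosen for illustration; this is not the authors' Bayesian model, which additionally pools across studies and accounts for uncertainty.

```python
from typing import Sequence


def merged_sensitivity(stage_sens: Sequence[float],
                       stage_props: Sequence[float]) -> float:
    """Sensitivity for a merged stage category.

    stage_sens  : sensitivity of the test at each stage in the merged group
    stage_props : proportion (or count) of diseased patients at each stage;
                  these are renormalised within the merged group
    """
    if len(stage_sens) != len(stage_props):
        raise ValueError("stage_sens and stage_props must have equal length")
    total = sum(stage_props)
    if total <= 0:
        raise ValueError("stage proportions must sum to a positive value")
    # Weighted average of stage-specific sensitivities
    return sum(s * p for s, p in zip(stage_sens, stage_props)) / total


# Hypothetical values for three stages (illustration only, not from the paper)
sens = [0.40, 0.60, 0.80]   # sensitivity at stages I, II, III
props = [0.2, 0.3, 0.5]     # share of diseased patients at each stage

overall = merged_sensitivity(sens, props)            # merged across all stages
early = merged_sensitivity(sens[:2], props[:2])      # merged stages I-II
print(f"all stages: {overall:.2f}, stages I-II: {early:.2f}")
```

Read in reverse, the same relationship suggests why a study reporting only a merged sensitivity plus stage proportions still carries information about the stage-specific sensitivities: it constrains their weighted average.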

Paper Structure

This paper contains 17 sections, 15 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Results from a meta-regression of the sensitivity of AFP in detecting HCC, with the proportion of HCCs that were at a 'very early' stage as a covariate, acting on the location parameter of the multiple thresholds model. The coefficient was estimated as -4.89 (95% CrI -9.94, -1.45). Data and results are shown for the threshold of 20 ng/ml but the analysis is based on all available thresholds (a total of 31 data points). The shaded area represents the 95% CrIs around estimated sensitivity.
  • Figure 2: Accuracy of tests to detect HCC in people with cirrhosis. The circles represent summary estimates of the probability of a positive test result and the bars around them are 95% CrIs. (US=Ultrasound, MRI=Magnetic resonance imaging, CT=Computed tomography, AFP=Alpha-fetoprotein, AFP-L3=Lens culinaris agglutinin-reactive fraction of alpha-fetoprotein)
  • Figure 3: Results from models fitted to artificial data, Scenario I. Solid lines show results from fitting models to incomplete data. Dashed lines (identical across the two plots) show results from fitting models to the ideal full data, if it were available, and are shown for comparison. Shaded areas represent 95% CrIs around the point estimates. The circles representing the observed points vary in diameter according to the sample size. Points connected by a gray line represent data belonging to the same study. FPF=False positive fraction.
  • Figure 4: Results from models fitted to artificial data, Scenario II. Solid lines show results from fitting models to incomplete data. Dashed lines (identical across the two plots) show results from fitting models to the ideal full data, if it were available, and are shown for comparison. Shaded areas represent 95% CrIs around the point estimates. The diameter of the observed points varies according to sample size. FPF=False positive fraction.
  • Figure 5: Pooled estimates of the probability of a positive test result for the AFP data from the HCC review, along with 95% CrIs represented by the shaded areas. The diameter of the points varies according to sample size.