
On the uncertainty from the first-stage estimation of prognostic covariate adjustment in randomized controlled trials

Nodoka Seya, Masataka Taguri

Abstract

Prognostic covariate adjustment (PROCOVA) is a method for covariate adjustment in randomized controlled trials. PROCOVA is a two-sample, two-stage estimation method. In the first stage, the prognostic score, defined as the conditional expectation of the outcome given covariates under the control treatment, is estimated from historical data. In the second stage, ANCOVA is performed with the estimated prognostic score and the treatment assignment as explanatory variables, yielding an estimate of the average treatment effect. Although the prognostic score is estimated in this procedure, the variance estimator used in practice treats it as known. Furthermore, the difference in asymptotic variance between the case where the prognostic score is known and the case where it is estimated has not been clarified. In this study, we derive these two asymptotic variances and show that they are equal. We also construct two variance estimators, one that treats the prognostic score as known and one that accounts for its estimation, and compare their performance through simulation studies and a data application. Both variance estimators are asymptotically valid. When the historical data set is small, the variance estimator that explicitly accounts for the prognostic score estimation is recommended if one prefers conservative inference.
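The two-stage procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the first-stage model is taken to be ordinary least squares, the data are simulated, and the function names (`fit_prognostic_model`, `prognostic_score`, `procova`) are hypothetical. The variance shown is a standard Huber-White (HC0) sandwich that treats the prognostic score as known; the estimator that accounts for first-stage estimation is not reproduced here because its form is not given in this overview.

```python
import numpy as np

def fit_prognostic_model(X_hist, y_hist):
    """Stage 1 (assumed linear): least-squares fit of the outcome on covariates
    using historical control data."""
    Z = np.column_stack([np.ones(len(X_hist)), X_hist])
    theta, *_ = np.linalg.lstsq(Z, y_hist, rcond=None)
    return theta

def prognostic_score(theta, X):
    """Predicted outcome under control for covariates X."""
    return np.column_stack([np.ones(len(X)), X]) @ theta

def procova(y, a, score):
    """Stage 2: ANCOVA of y on (1, a, score).

    Returns the coefficients (beta_0, beta_A, beta_1) and an HC0 sandwich
    covariance that treats the prognostic score as known.
    """
    D = np.column_stack([np.ones(len(y)), a, score])
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ beta
    bread = np.linalg.inv(D.T @ D)
    meat = (D * resid[:, None] ** 2).T @ D  # sum_i resid_i^2 * d_i d_i^T
    return beta, bread @ meat @ bread

# Simulated data (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n_hist, n = 2000, 200
coef = np.array([1.0, -0.5, 0.25])

X_hist = rng.normal(size=(n_hist, 3))
y_hist = 1.0 + X_hist @ coef + rng.normal(size=n_hist)  # historical controls

X = rng.normal(size=(n, 3))
a = rng.integers(0, 2, size=n)                           # 1:1 randomization
y = 1.0 + X @ coef + 2.0 * a + rng.normal(size=n)        # true effect = 2.0

theta_hat = fit_prognostic_model(X_hist, y_hist)
beta_hat, V_fix = procova(y, a, prognostic_score(theta_hat, X))
se_A = np.sqrt(V_fix[1, 1])
print(f"estimated treatment effect: {beta_hat[1]:.3f} (SE {se_A:.3f})")
```

The treatment-effect estimate is the second coefficient, matching the ordering $(\beta_0, \beta_A, \beta_1)$ used in the figure captions.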

Paper Structure

This paper contains 24 sections, 7 theorems, 149 equations, 5 figures, 1 table.

Key Result

Theorem 1

Assume suitable regularity conditions for M-estimation, $\mathcal{D}\perp\mathcal{\tilde{D}}$ and $\lim_{n,\tilde{n}\to\infty} {n}/{\tilde{n}}=:\kappa\in[0,\infty)$. Then, the following statements hold: where $\Omega = \mathbb{E}\!\left[\psi(O;\beta^*,\theta^*)\psi(O;\beta^*,\theta^*)^\top\right]$, $Q_0=\mathbb{E}\!\left[\frac{\partial}{\partial\beta^\top}\psi(O;\beta,\theta^*)\middle|_{\beta=\be

Figures (5)

  • Figure 1: Plots of the coverage probability of the 95% CI over 1000 simulations for Scenario D-5. "beta0", "betaA", and "beta1" represent the intercept $\beta_0$, the coefficient for the treatment assignment $\beta_A$, and the coefficient for the prognostic score $\beta_1$ in the PROCOVA model \ref{eq: PROCOVA linear}, respectively. "fix" represents $e^\top\hat{V}_{\text{fix}}e$ with $e=(1,0,0)^\top$ for $\beta_0$, $e=(0,1,0)^\top$ for $\beta_A$, and $e=(0,0,1)^\top$ for $\beta_1$. "est" represents $e^\top\hat{V}_{\text{est}}e$ with $e=(1,0,0)^\top$ for $\beta_0$, $e=(0,1,0)^\top$ for $\beta_A$, and $e=(0,0,1)^\top$ for $\beta_1$. The x-axis represents the sample size of the trial data $n$. The sample size of the historical data is $\tilde{n} = 10n$. The y-axis represents the coverage probability, which is the proportion of the 1000 simulations in which the 95% CI based on each variance estimator includes the true value.
  • Figure 2: Plots of the mean of the ratio of the two variance estimators over 1000 simulations for Scenario D-5. "beta0", "betaA", and "beta1" represent the intercept $\beta_0$, the coefficient for the treatment assignment $\beta_A$, and the coefficient for the prognostic score $\beta_1$ in the PROCOVA model \ref{eq: PROCOVA linear}, respectively. The x-axis represents the sample size of the trial data $n$. The sample size of the historical data is $\tilde{n} = 10n$. The y-axis represents the mean of the ratio of the two variance estimators over 1000 simulations, i.e., $e^\top\hat{V}_{\text{est}}e/e^\top\hat{V}_{\text{fix}}e$ with $e=(1,0,0)^\top$ for $\beta_0$, $e=(0,1,0)^\top$ for $\beta_A$, and $e=(0,0,1)^\top$ for $\beta_1$.
  • Figure 3: Plots of the coverage probability of the 95% CI for $\beta_A$ in the PROCOVA model \ref{eq: PROCOVA linear} over 1000 simulations for Scenario D-5. "fix" represents $e^\top\hat{V}_{\text{fix}}e$ with $e=(1,0,0)^\top$ for $\beta_0$, $e=(0,1,0)^\top$ for $\beta_A$, and $e=(0,0,1)^\top$ for $\beta_1$. "est" represents $e^\top\hat{V}_{\text{est}}e$ with $e=(1,0,0)^\top$ for $\beta_0$, $e=(0,1,0)^\top$ for $\beta_A$, and $e=(0,0,1)^\top$ for $\beta_1$. The sample size of the trial data is $n=100$. The x-axis represents the sample size of the historical data $\tilde{n}$. The y-axis represents the coverage probability, which is the proportion of the 1000 simulations in which the 95% CI based on each variance estimator includes the true value.
  • Figure 4: Plots of the coverage probability of the 95% CI over 1000 simulations for Scenario D-5. "beta0", "betaA", and "beta1" represent the intercept $\beta_0$, the coefficient for the treatment assignment $\beta_A$, and the coefficient for the prognostic score $\beta_1$ in the PROCOVA model \ref{eq: PROCOVA linear}, respectively. "fix" represents $e^\top\hat{V}_{\text{fix}}e$ with $e=(1,0,0)^\top$ for $\beta_0$, $e=(0,1,0)^\top$ for $\beta_A$, and $e=(0,0,1)^\top$ for $\beta_1$. "est" represents $e^\top\hat{V}_{\text{est}}e$ with $e=(1,0,0)^\top$ for $\beta_0$, $e=(0,1,0)^\top$ for $\beta_A$, and $e=(0,0,1)^\top$ for $\beta_1$. The sample size of the trial data is $n=1000$. The x-axis represents the sample size of the historical data $\tilde{n}$. The y-axis represents the coverage probability, which is the proportion of the 1000 simulations in which the 95% CI based on each variance estimator includes the true value.
  • Figure 5: Plots of the ratio of the two variance estimators when PROCOVA is applied to the ACTG 175 data as the sample size of the historical data varies. "beta0", "betaA", and "beta1" represent the intercept $\beta_0$, the coefficient for the treatment assignment $\beta_A$, and the coefficient for the prognostic score $\beta_1$ in the PROCOVA model \ref{eq: PROCOVA linear}, respectively. The sample size of the trial data is $n=200$. The x-axis represents the sample size of the historical data $\tilde{n}$. The y-axis represents the ratio of the two variance estimators, i.e., $e^\top\hat{V}_{\text{est}}e/e^\top\hat{V}_{\text{fix}}e$ with $e=(1,0,0)^\top$ for $\beta_0$, $e=(0,1,0)^\top$ for $\beta_A$, and $e=(0,0,1)^\top$ for $\beta_1$.
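
The two summary metrics plotted in the figures, coverage probability and the mean ratio of the variance estimators, can be computed along these lines. This is a hedged sketch: the function names are hypothetical, and the inputs are synthetic draws with an arbitrary 5% inflation of one variance, chosen only to exercise the code. They are not results from the paper.

```python
import numpy as np

def coverage_probability(estimates, variances, true_value, z=1.959964):
    """Proportion of simulations whose Wald 95% CI contains the true value."""
    estimates = np.asarray(estimates, dtype=float)
    half_width = z * np.sqrt(np.asarray(variances, dtype=float))
    covered = (estimates - half_width <= true_value) & (
        true_value <= estimates + half_width
    )
    return covered.mean()

def mean_variance_ratio(var_est, var_fix):
    """Mean over simulations of e'V_est e / e'V_fix e for one coefficient."""
    ratio = np.asarray(var_est, dtype=float) / np.asarray(var_fix, dtype=float)
    return float(ratio.mean())

# Synthetic inputs standing in for per-simulation output (assumed values).
rng = np.random.default_rng(1)
n_sim, true_beta_A, sd = 1000, 2.0, 0.1
estimates = rng.normal(true_beta_A, sd, size=n_sim)
var_fix = np.full(n_sim, sd ** 2)   # variance treating the score as known
var_est = 1.05 * var_fix            # arbitrary inflation, for illustration only

cov_fix = coverage_probability(estimates, var_fix, true_beta_A)
cov_est = coverage_probability(estimates, var_est, true_beta_A)
ratio = mean_variance_ratio(var_est, var_fix)
print(cov_fix, cov_est, ratio)
```

Here each array holds one entry per simulation replicate, with the variance entries corresponding to $e^\top\hat{V}e$ for a single choice of $e$.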

Theorems & Definitions (13)

  • Theorem 1
  • Corollary 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • proof
  • proof
  • proof
  • proof
  • ...and 3 more