Redefining shared information: a heterogeneity-adaptive framework for meta-analysis

Elizabeth M. Davis, Emily C. Hector

TL;DR

This paper develops a heterogeneity-adaptive meta-analysis method for linear models that adapts to the amount of information shared between datasets, and establishes the estimator's desirable inferential properties without assuming homogeneity of dataset parameters.

Abstract

Meta-analytic methods tend to take all-or-nothing approaches to study-level heterogeneity, assuming all studies are heterogeneous or homogeneous, leading to inefficiency and/or bias in estimation and inference. In this paper, we develop a heterogeneity-adaptive meta-analysis in linear models that adapts to the amount of information shared between datasets. The primary mechanism for the information-sharing is a shrinkage of dataset-specific distributions towards a new "centroid" distribution through a Kullback-Leibler divergence penalty. The Kullback-Leibler divergence is uniquely geometrically suited for measuring relative information between datasets, and leads to relatively simple closed form estimators with intuitive interpretations. We establish our estimator's desirable inferential properties without assuming homogeneity of dataset parameters. Among other results, we show that our estimator has a provably smaller mean squared error than the dataset-specific maximum likelihood estimators, and establish asymptotically valid inference procedures. A comprehensive set of simulations highlights our estimator's versatility, and an analysis of data from the eICU Collaborative Research Database illustrates its performance in a real-world setting.
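
To give a feel for why a Kullback-Leibler penalty can lead to simple closed forms in linear models, the following is a minimal sketch, not the paper's estimator. It assumes Gaussian errors, a design matrix shared by both distributions, and a known common variance; the names `kl_gaussian_linear`, `kl_mvn`, `beta_study`, and `beta_centroid` are hypothetical and chosen only for illustration. Under those assumptions the divergence between a dataset-specific distribution and a reference distribution reduces to a scaled squared distance between fitted means, which the code cross-checks against the general multivariate normal formula.

```python
import numpy as np


def kl_gaussian_linear(X, beta_p, beta_q, sigma2):
    """KL( N(X beta_p, sigma2 I) || N(X beta_q, sigma2 I) ) for a shared design X.

    Illustrative assumption: both distributions share X and a known variance sigma2.
    """
    diff = X @ (np.asarray(beta_p) - np.asarray(beta_q))
    return float(diff @ diff) / (2.0 * sigma2)


def kl_mvn(mu_p, cov_p, mu_q, cov_q):
    """General KL divergence between two multivariate normals (cross-check only)."""
    d = mu_p.shape[0]
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mu_q - mu_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (
        np.trace(cov_q_inv @ cov_p) + diff @ cov_q_inv @ diff - d + logdet_q - logdet_p
    )


# Hypothetical inputs: one study's coefficients and a reference ("centroid") vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
beta_study = np.array([1.0, -0.5, 2.0])
beta_centroid = np.array([0.8, -0.2, 1.5])
sigma2 = 1.5

kl_closed = kl_gaussian_linear(X, beta_study, beta_centroid, sigma2)
cov = sigma2 * np.eye(X.shape[0])
kl_general = kl_mvn(X @ beta_study, cov, X @ beta_centroid, cov)
print(kl_closed, kl_general)
```

Because the divergence is quadratic in the coefficient difference in this simplified setting, penalizing it behaves like a ridge-type pull towards the reference coefficients. How the paper constructs its centroid distribution and weights the penalty is not shown in this summary.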

Paper Structure (28 sections, 5 theorems, 128 equations, 8 figures, 10 tables)

Key Result

Theorem 1

There exists a $\boldsymbol{\pi}\neq \boldsymbol{0}$ such that $\mathsf{MSE}\{\boldsymbol{\widehat{\beta}}(\boldsymbol{\pi})\}< \mathsf{MSE}(\boldsymbol{\tilde{\beta}})$. In the case where $\pi_j\equiv\pi$, the MSE-minimizing $\pi$ is ... where $\boldsymbol{A} = (\boldsymbol{K}^\prime \boldsymbol{X}^{\top}\boldsymbol{\Sigma}^{-1}\boldsymbol{X} \boldsymbol{K})^{-1} \boldsymbol{K}^\prime \boldsymbol{X} \cdots$

Figures (8)

  • Figure 1: Panel A displays possible values for $\boldsymbol{\widehat{\theta}}(\boldsymbol{\pi})$ for four datasets. Any coordinate pair within the outlined shape can be obtained for an appropriate $\boldsymbol{\pi}$ vector. Panels B and C display possible values for each $\boldsymbol{\widehat{\beta}}_j(\boldsymbol{\pi})$ for $k=4$. For a given dataset, denoted by color, any coordinate pair within the polygon may be obtained for an appropriate $\boldsymbol{\pi}$ vector. In example A, arrows display a shift from $\boldsymbol{\pi} = (0.2, 0.2, 0.2, 0.2)$ to $\boldsymbol{\pi} = (0.8, 0.8, 0.8, 0.8)$. In example B, arrows display a shift from $\boldsymbol{\pi} = (0.5, 0.0, 0.5, 0.5)$ to $\boldsymbol{\pi} = (0.5, 0.5, 0.5, 0.3)$.
  • Figure 2: Each ray represents a line of equal $\boldsymbol{\widehat{\theta}} (\boldsymbol{\pi})$. As $\pi_1$ and $\pi_2$ get further from the origin, $\boldsymbol{\widehat{\beta}}(\boldsymbol{\pi})$ is more heavily weighted towards the combined estimator, borrowing more information.
  • Figure 3: A demonstration of the proposed correction to the UMSE. The solid line yields the optimal scaling, $c^{\star}$, as the root. The lower dashed line over-borrows, on average. Shifting the lower line up by $\delta$ to the parallel dashed line above it gives a root at $c^{\star}$.
  • Figure 4: $\boldsymbol{\pi}_{\textsf{HAM}}$ values across heterogeneity conditions when $k = 15$ in setting 2.
  • Figure 5: Coverage of $\widehat{\beta}_{\textsf{HAM}, j}$ versus $\beta_j$ for each study in setting 3.
  • ...and 3 more figures

Theorems & Definitions (6)

  • Theorem 1
  • Theorem 2
  • Corollary 2.1
  • Theorem 3
  • Theorem 4
  • proof