Sensitivity analysis for contamination in egocentric-network randomized trials with interference

Bar Weinstein, Daniel Nevo

TL;DR

This paper addresses contamination in egocentric-network randomized trials (ENRTs) conducted under interference when full sociocentric data are unavailable. It develops a design-based framework showing that Horvitz-Thompson estimators of the indirect and direct effects are biased under network contamination, and introduces bias-corrected estimators that depend on sensitivity parameters for missing edges, including an interaction parameter κ for egos. The authors implement two complementary sensitivity analyses, grid sensitivity analysis (GSA) and probabilistic bias analysis (PBA), and illustrate them with simulations and the HIV Prevention Trials Network 037 study, where ignoring contamination may lead to underestimation of indirect effects and overestimation of direct effects. The methods are implemented in the R package ENRTsensitivity and offer practical guidance for assessing the robustness of ENRT causal estimates when complete network data are impractical to obtain.

Abstract

Egocentric-Network Randomized Trials (ENRTs) are increasingly used to estimate causal effects under interference when measuring complete sociocentric network data is infeasible. ENRTs rely on egocentric network sampling, where a set of egos is first sampled, and each ego recruits a subset of its neighbors as alters. Treatments are then randomized across egos. While the observed ego-networks are disjoint by design, the underlying population network may contain edges connecting them, leading to contamination. Under a design-based framework, we show that the Horvitz-Thompson estimators of direct and indirect effects are biased whenever contamination is present. To address this, we derive bias-corrected estimators and propose a novel sensitivity analysis framework based on sensitivity parameters representing the probability or expected number of missing edges. This framework is implemented via both grid sensitivity analysis and probabilistic bias analysis, providing researchers with a flexible tool to assess the robustness of the causal estimators to contamination. We apply our methodology to the HIV Prevention Trials Network 037 study, finding that ignoring contamination may lead to underestimation of indirect effects and overestimation of direct effects.
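
The two sensitivity analyses named in the abstract can be sketched generically. The Python sketch below is a minimal illustration, not the paper's estimators and not the ENRTsensitivity package API. It assumes a toy correction in which contamination attenuates a naive estimate by a factor of one minus an exposure probability. The function names, the uniform prior, and the numeric inputs are all hypothetical.

```python
import random


def toy_adjust(naive_estimate, exposure_prob):
    """Toy correction (assumption, not the paper's formula): contamination
    attenuates the naive estimate by a factor (1 - exposure_prob)."""
    if not 0.0 <= exposure_prob < 1.0:
        raise ValueError("exposure_prob must lie in [0, 1)")
    return naive_estimate / (1.0 - exposure_prob)


def grid_sensitivity(naive_estimate, grid):
    """Grid sensitivity analysis: adjusted estimate at each fixed value
    of the sensitivity parameter."""
    return [(theta, toy_adjust(naive_estimate, theta)) for theta in grid]


def probabilistic_bias(naive_estimate, std_error, draw_theta,
                       n_draws=5000, seed=0):
    """Probabilistic bias analysis: draw the sensitivity parameter from a
    prior and the estimate from a normal approximation to its sampling
    distribution, then summarize the adjusted draws."""
    rng = random.Random(seed)
    draws = sorted(
        toy_adjust(rng.gauss(naive_estimate, std_error), draw_theta(rng))
        for _ in range(n_draws)
    )
    mean = sum(draws) / n_draws
    lower = draws[int(0.025 * n_draws)]       # empirical 2.5% quantile
    upper = draws[int(0.975 * n_draws) - 1]   # empirical 97.5% quantile
    return mean, lower, upper


# Hypothetical inputs, not taken from the HPTN 037 analysis.
gsa = grid_sensitivity(-0.10, [0.0, 0.1, 0.2, 0.3])
pba = probabilistic_bias(-0.10, 0.03, lambda rng: rng.uniform(0.0, 0.3))
```

The grid version reports how the estimate moves as the sensitivity parameter is varied over fixed values, while the probabilistic version folds a prior on that parameter and sampling variability into a single interval.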

Paper Structure

This paper contains 36 sections, 3 theorems, 81 equations, 11 figures, and 3 tables.

Key Result

Proposition 1

Under the ENRT design and Assumptions ass:bernoulli_design--ass:consis, the result is expressed in terms of $\pi_i^a = \Pr(F_i=1\mid i\in \mathcal{R}_a)$ and $\pi_i^e = \Pr(F_i=1 \mid i \in \mathcal{R}_e)$, the probabilities that an alter and an ego, respectively, are exposed to at least one treated neighbor in the population network $\boldsymbol{A}$.
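
To make an exposure probability of this kind concrete, the Python sketch below computes the chance that a unit has at least one treated neighbor among its unobserved candidate neighbors. It rests on simplifying assumptions that are not taken from the proposition: each candidate neighbor is treated independently with probability `treat_prob`, and each unobserved edge is present independently with probability `edge_prob`. The function names are hypothetical.

```python
import math


def exposure_prob_homogeneous(edge_prob, treat_prob, n_candidates):
    """P(at least one treated neighbor) when every candidate neighbor is
    connected with the same probability edge_prob and treated with
    probability treat_prob, independently across candidates."""
    return 1.0 - (1.0 - edge_prob * treat_prob) ** n_candidates


def exposure_prob_heterogeneous(edge_probs, treat_prob):
    """Same quantity when each candidate neighbor j has its own edge
    probability edge_probs[j]."""
    return 1.0 - math.prod(1.0 - rho * treat_prob for rho in edge_probs)


# With two certain edges and a fair-coin treatment, exposure has probability 0.75.
example = exposure_prob_homogeneous(1.0, 0.5, 2)
```

Under these assumptions a candidate contributes a treated neighbor with probability `edge_prob * treat_prob`, so the probability of no exposure is the product of the complements.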

Figures (11)

  • Figure 1: Illustration of egocentric sampling. Panel (A) shows the full $N=7$ population network $\boldsymbol{G}$. Panel (B) illustrates a single egocentric sample $\widetilde{\boldsymbol{G}}$: each ego (dark blue, $n_e=2$) reports only a subset of its true neighborhood as alters (medium blue, $n_a=3$). The observed network is composed of disjoint ego–networks (solid lines), while ego–ego and unreported ego–alter connections remain latent (dashed lines). All other edges are also unobserved (dotted lines).
  • Figure 2: Grid sensitivity analysis results for the indirect effect using the augmented estimator $\widehat{IE}_{adj}^{aug}$. The x-axis represents the approximate total number of missing alter--ego edges $m^a$, while the y-axis represents the estimated values. Results are shown for both homogeneous (blue) and heterogeneous (orange) edge probability specifications.
  • Figure 3: Grid sensitivity analysis results for the direct effect using the augmented estimator $\widehat{DE}_{adj}^{aug}$. The x-axis represents the approximate total number of missing ego--ego edges $m^e$, while the y-axis represents $\kappa$, the interaction sensitivity parameter. Contour lines and color shading represent the estimated values. Results are shown for both heterogeneous (left) and homogeneous (right) edge probability specifications.
  • Figure 4: Probabilistic bias analysis results for the indirect effect (left) using the augmented estimator $\widehat{IE}_{adj}^{aug}$, and the direct effect (right) using the augmented estimator $\widehat{DE}_{adj}^{aug}$. Results are shown as means with $95\%$ intervals for both homogeneous and heterogeneous edge probability specifications, accounting for both bias and statistical uncertainty. Colors represent different specified distributions for the total number of missing edges and the interaction sensitivity parameter $\kappa$.
  • Figure E.1: Distribution of number of alters per ego-network.
  • ...and 6 more figures

Theorems & Definitions (7)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Example 1: Homogeneous probabilities
  • Example 2: Homogeneous number of edges
  • Example 3: Heterogeneous probabilities
  • Example 4: Heterogeneous number of missing edges
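
The example titles above contrast homogeneous and heterogeneous specifications, stated either as edge probabilities or as numbers of missing edges. The Python sketch below shows one way an expected total number of missing edges could be translated into per-pair edge probabilities. It is an illustrative assumption rather than the paper's definitions: the function names, the candidate-pair count, and the weights are hypothetical.

```python
def homogeneous_edge_prob(expected_missing_edges, n_candidate_pairs):
    """Spread an expected total of missing edges evenly over all
    candidate pairs, giving a single per-pair edge probability."""
    rho = expected_missing_edges / n_candidate_pairs
    if not 0.0 <= rho <= 1.0:
        raise ValueError("implied edge probability is outside [0, 1]")
    return rho


def heterogeneous_edge_probs(expected_missing_edges, weights):
    """Allocate the same expected total in proportion to nonnegative
    pair-level weights (for example, a similarity score per pair)."""
    total = sum(weights)
    probs = [expected_missing_edges * w / total for w in weights]
    if max(probs) > 1.0:
        raise ValueError("some implied edge probability exceeds 1")
    return probs


# Hypothetical setting: 30 expected missing edges over 600 candidate pairs.
rho = homogeneous_edge_prob(30, 600)
rhos = heterogeneous_edge_probs(30, [1, 2, 3] * 200)
```

Both specifications imply the same expected number of missing edges and differ only in how that total is distributed across pairs.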