
Flexible Imputation of Incomplete Network Data

Ge Sun, Weisheng Zhang

Abstract

Sampled network data are common in empirical research because collecting full network information is costly, but using sampled networks can lead to biased estimates. We propose a nonparametric imputation method for sampled networks and show that empirical analysis based on imputed networks yields consistent parameter estimates. Our approach imputes missing network links by combining a projection onto covariates with a local two-way fixed-effects regression. It avoids parametric assumptions, does not rely on low-rank restrictions, and flexibly accommodates both observed covariates and unobserved heterogeneity. We establish entrywise convergence rates for the imputed matrix and prove the consistency of GMM estimators based on the imputed network. We further derive the convergence rate of the corresponding estimator in the linear-in-means peer-effects model. Simulations show that our method performs well in both imputation accuracy and downstream empirical analysis. We illustrate our method with an application to the microfinance network data of Banerjee et al. (2013).

Paper Structure

This paper contains 61 sections, 12 theorems, 199 equations, 4 figures, 7 tables, 3 algorithms.

Key Result

Lemma 1

Under the regularity assumption, there exist constants $\gamma_1, \gamma_2 > 0$ such that ...

Figures (4)

  • Figure (a): Full network
  • Figure (b): Sampled network obtained by sampling nodes $\{1,2\}$; red nodes are not in the sample, and dashed edges are not observed
  • Figure (c): From the full adjacency matrix to the partially observed adjacency matrix; $\times$ denotes a missing entry
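
The step from a full adjacency matrix to a partially observed one, as described in the figure captions above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mask_sampled_network`, the 4-node example network, and the observation rule (an entry is observed whenever at least one of its two endpoints is sampled) are hypothetical and not taken from the paper, whose sampling scheme may differ.

```python
# Hypothetical illustration: the names and the observation rule below are
# assumptions for this sketch, not the paper's definitions.

def mask_sampled_network(adjacency, sampled):
    """Return a copy of `adjacency` with unobserved entries set to None.

    Assumed observation rule: entry (i, j) is observed if node i or node j
    is in the sample; entries between two unsampled nodes are missing.
    """
    n = len(adjacency)
    sampled = set(sampled)
    observed = []
    for i in range(n):
        row = []
        for j in range(n):
            if i in sampled or j in sampled:
                row.append(adjacency[i][j])
            else:
                row.append(None)  # missing entry, drawn as "x" below
        observed.append(row)
    return observed


# A hypothetical 4-node undirected network (nodes are 0-indexed here).
full = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

# Sample the first two nodes, i.e. {1, 2} in 1-indexed notation.
partial = mask_sampled_network(full, sampled=[0, 1])

# Print the partially observed matrix, marking missing entries with "x".
for row in partial:
    print(" ".join("x" if v is None else str(v) for v in row))
```

Under this assumed rule, the rows and columns of sampled nodes stay fully observed, while the block of links among unsampled nodes is missing. That missing block is what an imputation method would need to fill in.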

Theorems & Definitions (20)

  • Example 1: Stochastic block model
  • Example 2: Network formation with transferable utility
  • Lemma 1
  • Example 1: Continued
  • Example 2: Continued
  • Theorem 2: Imputation errors
  • Example 3: Regression on centrality
  • Example 4: Linear-in-means peer-effects model
  • Example 3: Continued
  • Example 4: Continued
  • ...and 10 more