Inference in semiparametric formation models for directed networks
Lianqiang Qu, Lu Chen, Ting Yan, Yuguo Chen
TL;DR
The paper develops a semiparametric formation model for directed networks with node-specific out-degree and in-degree effects $\alpha_i$ and $\beta_j$, a covariate-driven homophily term $X_{ij}^\top\gamma$, and latent noise $\varepsilon_{ij}$ whose distribution is left unspecified. Estimation proceeds in stages: a kernel density estimator for a special regressor, a projection step that yields an unbiased estimator of the homophily parameter $\gamma$, and constrained least squares for the degree parameters. The authors establish consistency and high-dimensional central limit theorems for the estimators, with explicit variance formulas involving $V=U^\top U$ and $\sigma_\varepsilon^2$. These results support valid inference, including tests for sparse signals, support recovery, and tests for degree heterogeneity. Simulations and a real data application (Lazega partners) illustrate finite-sample performance, with the application showing improved fit and interpretable homophily and degree-heterogeneity effects. Extensions cover conditionally independent noises and weighted networks. Because the noise distribution is not specified, the framework reduces the risk of model misspecification while still permitting hypothesis testing and variable selection in high dimensions.
Abstract
We propose a semiparametric model for dyadic link formations in directed networks. The model contains a set of degree parameters that measure different effects of popularity or outgoingness across nodes, a regression parameter vector that reflects the homophily effect resulting from the nodal attributes or pairwise covariates associated with edges, and a set of latent random noises with unknown distributions. Our interest lies in inferring the unknown degree parameters and homophily parameters. The dimension of the degree parameters increases with the number of nodes. Under the high-dimensional regime, we develop a kernel-based least squares approach to estimate the unknown parameters. The major advantage of our estimator is that it does not encounter the incidental parameter problem for the homophily parameters. We prove consistency of all the resulting estimators of the degree parameters and homophily parameters. We establish high-dimensional central limit theorems for the proposed estimators and provide several applications of our general theory, including testing the existence of degree heterogeneity, testing sparse signals and recovering the support. Simulation studies and a real data application are conducted to illustrate the finite sample performance of the proposed methods.
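To make the model's ingredients concrete, the following is a minimal simulation sketch, not the authors' procedure. It assumes a threshold link rule in which node $i$ sends an edge to node $j$ when $\alpha_i + \beta_j + X_{ij}^\top\gamma > \varepsilon_{ij}$; this exact form, the function name `simulate_directed_network`, the Gaussian covariates and the logistic noise are all illustrative assumptions, and the special regressor used in estimation is omitted.

```python
import numpy as np


def simulate_directed_network(n, gamma, rng):
    """Draw one directed network from an ASSUMED threshold link rule.

    Hypothetical helper for illustration only; the paper's exact
    specification (including its special regressor) may differ.
    """
    p = len(gamma)
    alpha = rng.normal(0.0, 0.5, size=n)   # out-degree (outgoingness) effects
    beta = rng.normal(0.0, 0.5, size=n)    # in-degree (popularity) effects
    X = rng.normal(size=(n, n, p))         # pairwise covariates X_ij
    # The model leaves the noise distribution unknown; logistic noise is
    # an arbitrary choice made here only so that we can simulate.
    eps = rng.logistic(size=(n, n))
    # Linear index alpha_i + beta_j + X_ij' gamma, one entry per ordered pair.
    index = alpha[:, None] + beta[None, :] + X @ gamma
    A = (index > eps).astype(int)          # A[i, j] = 1 if i links to j
    np.fill_diagonal(A, 0)                 # no self-loops
    return A, alpha, beta


rng = np.random.default_rng(0)
gamma = np.array([1.0, -0.5])              # hypothetical homophily parameters
A, alpha, beta = simulate_directed_network(50, gamma, rng)

out_degree = A.sum(axis=1)                 # edges sent by each node
in_degree = A.sum(axis=0)                  # edges received by each node
```

In such a simulation, nodes with larger $\alpha_i$ tend to have larger out-degree and nodes with larger $\beta_j$ larger in-degree, which is the degree heterogeneity the tests in the paper target. The number of degree parameters ($2n$ here, before any identification constraint) grows with the number of nodes, which is why the setting is high-dimensional.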
