Autoregressive networks with dependent edges

Jinyuan Chang, Qin Fang, Eric D. Kolaczyk, Peter W. MacDonald, Qiwei Yao

TL;DR

The paper develops a flexible autoregressive framework for discrete-time dynamic networks with dependent edges, enabling explicit modeling of features like transitivity and degree heterogeneity through conditional edge probabilities. Edges are modeled as Bernoulli variables with probabilities that depend on past network states and other edges, yielding a tractable martingale-difference structure for asymptotic theory even when the network size grows with the sample size. Three concrete AR models (degree heterogeneity, persistence, and transitivity) illustrate the framework, and a comprehensive estimation strategy combines initial partial-likelihood estimation with projection-based improvements to achieve fast, high-dimensional convergence and inference. The approach unifies AR-network modeling with TERGMs/STERGMs, provides practical estimation in large networks, and demonstrates strong performance on real-data applications (email networks) and simulations, highlighting its potential for scalable, interpretable dynamic network analysis.

Abstract

We propose an autoregressive framework for modelling dynamic networks with dependent edges. It encompasses models that accommodate, for example, transitivity, degree heterogeneity, and other stylized features often observed in real network data. By assuming the edges of networks at each time are independent conditionally on their lagged values, the models, which exhibit a close connection with temporal ERGMs, facilitate both simulation and maximum likelihood estimation in a straightforward manner. Due to the possibly large number of parameters in the models, the natural MLEs may suffer from slow convergence rates. An improved estimator for each component parameter is proposed based on an iteration employing projection, which mitigates the impact of the other parameters (Chang et al., 2021; Chang et al., 2023). Leveraging a martingale difference structure, the asymptotic distribution of the improved estimator is derived without the assumption of stationarity. The limiting distribution is not normal in general, although it reduces to normal when the underlying process satisfies some mixing conditions. An illustration with a transitivity model is carried out on both simulated data and a real network data set.
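To make the conditional-independence assumption concrete, the following is a minimal simulation sketch of an AR(1) network in which each edge at time $t$ is drawn independently given the network at time $t-1$. The logistic formation and dissolution probabilities, the common-neighbour statistic, the parameter names `a0`, `a1`, `b0`, `b1`, the initial density of 0.1, and the function name `simulate_ar1_network` are all illustrative assumptions, not the paper's exact specification.

```python
import numpy as np


def simulate_ar1_network(p, n, a0=-2.0, a1=4.0, b0=-1.0, b1=4.0, seed=0):
    """Simulate n steps of an undirected AR(1) network on p nodes.

    Hypothetical model (an assumption for illustration only):
      - a missing edge forms with probability
            alpha = expit(a0 + a1 * c)
      - an existing edge dissolves with probability
            beta  = expit(b0 - b1 * c)
      where c is the share of common neighbours of the node pair at t-1.
    Given the previous network, all edges are drawn independently.
    """
    rng = np.random.default_rng(seed)
    iu = np.triu_indices(p, k=1)            # index pairs i < j
    X = np.zeros((n + 1, p, p), dtype=np.int8)

    # Initial network: independent edges with density 0.1 (assumption).
    A = np.zeros((p, p), dtype=np.int8)
    A[iu] = rng.random(len(iu[0])) < 0.1
    X[0] = A + A.T

    for t in range(1, n + 1):
        prev = X[t - 1].astype(np.int64)
        # (prev @ prev)[i, j] counts the common neighbours of i and j.
        common = (prev @ prev) / max(p - 2, 1)
        alpha = 1.0 / (1.0 + np.exp(-(a0 + a1 * common)))  # formation
        beta = 1.0 / (1.0 + np.exp(-(b0 - b1 * common)))   # dissolution
        # P(edge at t | past): keep with prob 1 - beta, form with prob alpha.
        prob_edge = np.where(prev == 1, 1.0 - beta, alpha)

        A = np.zeros((p, p), dtype=np.int8)
        A[iu] = rng.random(len(iu[0])) < prob_edge[iu]
        X[t] = A + A.T                      # symmetrise, zero diagonal
    return X


if __name__ == "__main__":
    X = simulate_ar1_network(p=30, n=50, seed=1)
    density = X.sum(axis=(1, 2)) / (30 * 29)
    print("edge density at first and last step:", density[0], density[-1])
```

Because the edge probabilities depend on the previous network only through `prev`, each step reduces to one vectorised Bernoulli draw over the upper triangle, which is what makes simulation straightforward in this class of models.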

Paper Structure

This paper contains 42 sections, 9 theorems, 207 equations, 54 figures, 6 tables, 1 algorithm.

Key Result

Proposition 1

Let Conditions as:gambd--as:eigen hold, and $C_*=2(2C_1^{-2}+C_1^{-3})C_2^3+3(C_1^{-1}+C_1^{-2})C_2^2+C_1^{-1}C_2$ with $(C_1,C_2)$ specified in Condition as:gambd. Assume $\sup_{\boldsymbol \theta\in\boldsymbol \Theta}|\boldsymbol \theta-\boldsymbol \theta_0|_\infty<2C_3/(C_*s^3)$. As $n\rightarrow$ ... for any $\boldsymbol \theta\in\boldsymbol \Theta$ and $l\in[q]$, where $\bar{C}>0$ is a universal c...

Figures (54)

  • Figure 1: Scatter plots of estimates $\{\hat{\xi}_i\}_{i=1}^{106}$ and $\{\hat{\eta}_i\}_{i=1}^{106}$ for periods 1 and 2, email interaction data. Circles are sized and coloured according to their level in the company organizational tree from 1 (no direct reports) to 5 (CEO). Level 1: black, smallest; level 2: red; level 3: green; level 4: cyan; level 5: purple, largest.
  • Figure 2: ROC curves for link prediction performance, email interaction data.
  • Figure S1: Time series plots of $\{D_t\}_{t=2}^{200}$, $\{D_{1,t}\}_{t=2}^{200}$, $\{D_{0,t}\}_{t=2}^{200}$, $\{\bar{D}_t\}_{t=2}^{200}$, $\{\bar{D}_{1,t}\}_{t=2}^{200}$ and $\{\bar{D}_{0,t}\}_{t=2}^{200}$ for the four simulated settings for $p = 100$. The black, red, green and blue curves correspond to the settings $(\xi_i, \eta_i, a, b) = (0.7,0.8,30,15)$, $(0.6,0.7,20,20)$, $(0.6,0.7,15,10)$ and $(0.6,0.7,10,10)$, respectively.
  • Figure S2: Time series plots of $\{D_t\}_{t=2}^{200}$, $\{D_{1,t}\}_{t=2}^{200}$, $\{D_{0,t}\}_{t=2}^{200}$, $\{\bar{D}_t\}_{t=2}^{200}$, $\{\bar{D}_{1,t}\}_{t=2}^{200}$ and $\{\bar{D}_{0,t}\}_{t=2}^{200}$ for the four simulated settings for $p = 150$. The black, red, green and blue curves correspond to the settings $(\xi_i, \eta_i, a, b) = (0.7,0.8,30,15)$, $(0.6,0.7,20,20)$, $(0.6,0.7,15,10)$ and $(0.6,0.7,10,10)$, respectively.
  • Figure S3: Time series plots of $\{D_t\}_{t=2}^{500}$, $\{D_{1,t}\}_{t=2}^{500}$, $\{D_{0,t}\}_{t=2}^{500}$ (left to right) for the degree heterogeneity model when $\xi_i = 0.1$ and $\eta_i \in (0.1, 0.3, 0.5)$. The 25 colored curves in each subplot correspond to 25 settings where $a_0=a_1\in\{0.2,0.4,0.6,0.8,1\}$ and $b_0=b_1\in\{0.2,0.4,0.6,0.8,1\}$.
  • ...and 49 more figures

Theorems & Definitions (17)

  • Definition 1: AR($m$) networks
  • Definition 2: Global/local parameters
  • Proposition 1
  • Theorem 1
  • Remark 1
  • Remark 2
  • Proposition 2
  • Remark 3
  • Remark 4
  • Remark 5
  • ...and 7 more