Predicting the structure of dynamic graphs

Sevvandi Kandanaarachchi, Ziqi Xu, Stefan Westerlund

TL;DR

The paper tackles forecasting the evolving structure of discrete-time dynamic graphs, including unseen nodes and edges. It adapts Flux Balance Analysis (FBA) to graph forecasting: node degrees are forecast with ARIMA, an incidence matrix $S$ is built from the union graph, and two FBA-inspired optimization formulations with edge-importance coefficients $\xi_{ij}$ are solved to generate the forecast graph $\hat{\mathcal{G}}_{T+h}$. Evaluation on synthetic and real networks (PA growth, Facebook, and Hep-PH) shows that the proposed formulations, particularly with harmonic-decay (C5) and last-seen (C6) coefficients, reduce node, edge, and density errors compared to last-seen baselines. An accompanying R package implements the methodology, enabling practical application and further extensions to weighted or directed graphs.

Abstract

Many aspects of graphs have been studied in depth. However, forecasting the structure of a graph at future time steps incorporating unseen, new nodes and edges has not gained much attention. In this paper, we present such an approach. Using a time series of graphs, we forecast graphs at future time steps. We use time series forecasting methods to predict the node degree at future time points and combine these forecasts with flux balance analysis -- a linear programming method used in biochemistry -- to obtain the structure of future graphs. We evaluate this approach using synthetic and real-world datasets and demonstrate its utility and applicability.

Paper Structure (18 sections, 1 theorem, 24 equations, 5 figures, 2 tables)

Key Result

Theorem 3.2

Let $\mathcal{U}$ denote the solution space of the optimization problem described in Formulation 1, with $\bm{u} \in \mathcal{U}$. Then the cardinality of $\mathcal{U}$, denoted by $\left\vert \mathcal{U} \right\vert$, satisfies the following bounds, where $C_1$ and $C_2$ depend on $n$ and $f_u(\bm{d})$.

Figures (5)

  • Figure 1: The graph time series $\{\mathcal{G}_t\}_{t=1}^T$ is used to extract multiple time series: $n_t$ is the number of nodes in $\mathcal{G}_t$ and $d_{i,t}$ is the degree of node $i$ at time $t$. The forecast $\hat{n}_{T+h}$ is computed from the time series $\{n_t\}_{t=1}^T$, and the degree forecast for existing nodes, $\hat{d}_{i, T+h}$, is computed from the time series $\{d_{i,t}\}_{t=1}^T$ using ARIMA models. The mean degree of nodes that appear for the first time, $d_{avg}$, is computed. The number of new nodes is $n_{new} = \hat{n}_{T+h} - n_T$. From $\{\mathcal{G}_t\}_{t=1}^T$, the union graph $\mathcal{G}_U$ and the matrix $S$ are computed. The optimization coefficients $\xi_{ij}$ are determined by the chosen scheme. Finally, adapted Flux Balance Analysis is carried out, giving $\hat{\mathcal{G}}_{T+h}$ as the output.
  • Figure 2: A simple illustration of the forecast graph distribution in terms of $\gamma$ and $u$. The parameter $\gamma$ gives different $\hat{n}_{T+h}$ values. The two graphs on the left have 4 nodes each, corresponding to a single $\gamma$ (or $\hat{n}_{T+h}$) value, but have different numbers of edges, corresponding to different $u$ values. Similarly, the two graphs on the right have 7 nodes, corresponding to a higher value of $\gamma$, with the rightmost graph having more edges due to a higher $u$.
  • Figure 3: Forecast synthetic networks using a sequence of 25 networks for $h \in \{ 1, 3, 5 \}$ illustrating the forecast graphs at time steps 26, 28 and 30. Node colour depicts the degree of the node, with darker nodes having a higher degree.
  • Figure 4: Edges and vertices in each network for real-world datasets.
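The pipeline in the Figure 1 caption can be illustrated with a small pure-Python sketch. It extracts $n_t$ and $d_{i,t}$ from a toy graph sequence, builds the union graph and its node-by-edge incidence matrix $S$, and then picks edges so that the implied degrees $S\bm{u}$ stay within forecast degrees. Several pieces here are assumptions rather than the paper's method: the toy data, the forecast degrees `d_hat` (hand-picked, standing in for ARIMA output), the weights `xi` (edge frequency, not one of the paper's coefficient schemes), and the brute-force binary search, which only stands in for the FBA-style linear programs. All function names are hypothetical.

```python
from itertools import product

# Toy graph time series: each graph is a set of undirected edges (hypothetical data).
graphs = [
    {(1, 2)},
    {(1, 2), (2, 3)},
    {(1, 2), (2, 3), (1, 3), (3, 4)},
]

def nodes_of(edges):
    """Set of nodes touched by a collection of edges."""
    return {v for e in edges for v in e}

def degree_series(graphs):
    """Return {node: [d_{i,1}, ..., d_{i,T}]}, with 0 before a node appears."""
    all_nodes = sorted(set().union(*(nodes_of(g) for g in graphs)))
    return {v: [sum(1 for e in g if v in e) for g in graphs] for v in all_nodes}

def union_graph(graphs):
    """Sorted edge list of the union graph G_U."""
    return sorted({tuple(sorted(e)) for g in graphs for e in g})

def incidence_matrix(nodes, edges):
    """S[i][k] = 1 if node i is an endpoint of edge k, else 0."""
    return [[1 if v in e else 0 for e in edges] for v in nodes]

def degrees_from(S, u):
    """Node degrees implied by edge-indicator vector u, i.e. the product S u."""
    return [sum(s * x for s, x in zip(row, u)) for row in S]

def best_edge_selection(S, xi, d_hat):
    """Toy stand-in for the LP: maximise sum(xi * u) over binary u with S u <= d_hat."""
    best_u, best_val = None, float("-inf")
    for u in product((0, 1), repeat=len(xi)):
        if all(d <= cap for d, cap in zip(degrees_from(S, u), d_hat)):
            val = sum(c * x for c, x in zip(xi, u))
            if val > best_val:
                best_u, best_val = list(u), val
    return best_u

n_t = [len(nodes_of(g)) for g in graphs]   # node-count series n_t
series = degree_series(graphs)             # degree series d_{i,t}
edges_U = union_graph(graphs)              # edges of the union graph
nodes_U = sorted(series)                   # nodes of the union graph
S = incidence_matrix(nodes_U, edges_U)

# Assumed weights: fraction of time steps in which each union-graph edge appears.
xi = [sum(1 for g in graphs if e in union_graph([g])) / len(graphs) for e in edges_U]

# Assumed degree forecasts for nodes 1..4 (an ARIMA forecast would supply these).
d_hat = [1, 2, 2, 1]
u = best_edge_selection(S, xi, d_hat)
forecast_edges = [e for e, keep in zip(edges_U, u) if keep]
```

Brute force is exponential in the number of union-graph edges, so it is only usable on toy inputs; a linear programming solver would replace `best_edge_selection` at any realistic scale. New nodes, which the caption handles through $n_{new}$ and $d_{avg}$, are left out of this sketch.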

Theorems & Definitions (3)

  • Definition 3.1
  • Theorem 3.2
  • proof