
Isolated vertices in two duplication-divergence models with edge deletion

Tiffany Y. Y. Lo, Gesine Reinert, Ruihua Zhang

TL;DR

Two novel models that incorporate random edge deletions into the duplication-divergence framework are presented. Simulations indicate that, as the network size tends to infinity, the proportion of isolated vertices can converge to a limit that is neither 0 nor 1.

Abstract

Duplication-divergence models are popular models for the evolution of gene and protein interaction networks. However, existing duplication-divergence models often neglect realistic features such as loss of interactions. Thus, in this paper we present two novel models that incorporate random edge deletions into the duplication-divergence framework. Because isolated vertices tend, by design, to be rare in protein-protein interaction networks (with proteins as vertices and interactions as edges), our main focus is on the number of isolated vertices. Our main result gives lower and upper bounds for the proportion of isolated vertices when the network size is large. Using these bounds, we identify the parameter regimes for which almost all vertices are typically isolated, and we also show that there are parameter regimes in which the proportion of isolated vertices can be bounded away from 0 and 1 with high probability. In addition, we find regimes in which the proportion of isolated vertices tends to be small. The proof relies on a standard martingale argument, which in turn requires a careful analysis of the first two moments of the expected degree distribution. The theoretical findings are illustrated by simulations, indicating that as the network size tends to infinity, the proportion of isolated vertices can converge to a limit that is neither 0 nor 1.

Paper Structure

This paper contains 7 sections, 12 theorems, 97 equations, 4 figures.

Key Result

Theorem 3.1

For Model $A$, assume that $q$ and $r$ are such that $u := 1 - 2q(1-r) > 0$. Then, regardless of the initial graph $G_{m_0}$,

Figures (4)

  • Figure 2: Evolution of the average proportion of isolated vertices of Model A over time, initialised either with a single edge (top) or a combination of a single vertex and an edge (bottom). The simulations are run for 1500 steps when $r=0$, and 1000 steps when $r \neq 0$, as the rate of convergence for the proportion of isolated vertices appears to increase as $r$ increases. Different values of $p$ are represented by different colours: red: $p=0$; orange: $p=0.2$; green: $p=2-\sqrt{22}/3\approx 0.4365$; blue: $p=0.6$; purple: $p=0.8$. For $r = 0$, Theorem \ref{th:LBk} requires $q \leq 1/2$. For $r \neq 0$ and $q = 0.9$, Theorem \ref{th:LBk} requires $p > 2 - \sqrt{22}/3$, subject to $q \leq \min \{1, 1/(2(1-r))\}$. Thus, the solid lines represent parameter sets satisfying the conditions of Theorem \ref{th:LBk}, while the dashed lines correspond to cases where these conditions are not met. Shaded regions correspond to $\pm 2$ standard errors, capped at $0$ and $1$. The coloured bars on the right indicate the intervals $[\rho_0,(\rho_1\wedge 1)]$, where $\rho_0$ in \ref{de:rho} does not depend on $p$. When $r=1$, $\rho_1$ in \ref{rho1} no longer depends on $p$, so the interval $[\rho_0,\rho_1]$ is represented by a single bar in magenta.
  • Figure 3: Red: $p=0$; Orange: $p=2 - \sqrt{182}/7 \approx 0.0728$; Green: $p=0.4$; Blue: $p=0.6$; Purple: $p=0.8$. For $r \neq 0$ and $q = 0.7$, Theorem 3.2 requires $p > 2 - \sqrt{182}/7$, subject to $q \leq \min \{1, 1/(2(1-r))\}$. While the conditions for Theorem 3.2 are not met when $q=0.7$ and $r=0$, the proportion of isolated vertices appears to converge to 1, even when $p$ is small. For $r=0.5,1$, the result of Theorem 3.2 seems to hold even when $p\le 2 - \sqrt{182}/7$ (in which case $\tau\ge 1$).
  • Figure 4: Red: $p=0$; Orange: $p=0.2$; Green: $p=0.4$; Blue: $p=0.6$; Purple: $p=0.8$. When $q=r=0.5$, $\rho_1\ge 1$ for all values of $p$, so we represent $[\rho_0,(\rho_1\wedge 1)]$ in this case by a single cyan bar. When $r=0.5,1$, the average proportion of isolated vertices appears to be closer to $\rho_0$ than to $\rho_1$, for all $p$ considered here.
  • Figure 5: Red: $p=0$; Orange: $p=0.2$; Green: $p=0.4$; Blue: $p=0.6$; Purple: $p=0.8$. When $q=0.2$ and $r=0.5$, $\rho_1\ge 4/3$ for all values of $p$, so we represent $[\rho_0,(\rho_1\wedge 1)]$ in this case by a single cyan bar. When $q=0.2$ and $r=1$, $\rho_1=1.2\ge 1$. When $r=0.5,1$, the average proportion of isolated vertices appears to be closer to $\rho_0$ than to $\rho_1$, for all $p$ considered here.
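
The parameter conditions quoted in the captions and in Theorem 3.1 can be checked numerically. The following is a minimal sketch, not code from the paper: the function names `u` and `q_upper_bound` are hypothetical, and it assumes that $q$ and $r$ are probabilities in $[0,1]$. It evaluates $u = 1 - 2q(1-r)$, the bound $\min\{1, 1/(2(1-r))\}$ on $q$, and the two threshold values of $p$ stated in the captions of Figures 2 and 3.

```python
import math


def u(q: float, r: float) -> float:
    """Quantity u := 1 - 2q(1 - r) from the statement of Theorem 3.1."""
    return 1.0 - 2.0 * q * (1.0 - r)


def q_upper_bound(r: float) -> float:
    """Bound min{1, 1/(2(1 - r))} on q quoted in the figure captions.

    Assumption: r lies in [0, 1]. At r = 1 the second term is
    unbounded, so the minimum is 1.
    """
    if r >= 1.0:
        return 1.0
    return min(1.0, 1.0 / (2.0 * (1.0 - r)))


# Threshold values of p exactly as written in the captions.
P_THRESHOLD_Q09 = 2.0 - math.sqrt(22.0) / 3.0   # Figure 2, q = 0.9
P_THRESHOLD_Q07 = 2.0 - math.sqrt(182.0) / 7.0  # Figure 3, q = 0.7


if __name__ == "__main__":
    for r in (0.0, 0.5, 1.0):
        for q in (0.2, 0.5, 0.7, 0.9):
            within_bound = q <= q_upper_bound(r)
            print(f"r={r}, q={q}: u={u(q, r):.3f}, "
                  f"q within caption bound: {within_bound}")
    print(f"p threshold for q=0.9: {P_THRESHOLD_Q09:.4f}")
    print(f"p threshold for q=0.7: {P_THRESHOLD_Q07:.4f}")
```

For $r = 0$ the bound reduces to $q \le 1/2$, which matches the caption of Figure 2; for $r \ge 1/2$ it places no restriction beyond $q \le 1$.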

Theorems & Definitions (28)

  • Remark 2.1
  • Theorem 3.1
  • Theorem 3.2
  • Remark 3.3
  • Corollary 3.1
  • Theorem 3.4: The $p=1$ case
  • Proposition 3.5: The $p=0, q=1$ case
  • Theorem 3.6: The $p=0, q=1$ case
  • Proposition 3.7
  • Proposition 3.8
  • ...and 18 more