The Robustness of Structural Features in Species Interaction Networks

Sanaz Hasanzadeh Fard, Emily Dolson

TL;DR

The paper examines how missing interactions bias topology metrics in species interaction networks. It runs a robustness analysis of six topological properties, one of which is community detection under four different algorithms, on 148 bipartite networks from the Web of Life dataset. Treating each observed graph as incomplete, it simulates candidate ground-truth networks by adding up to $m/2$ edges and measures how each property changes, revealing substantial variation in robustness across metrics and network types. Notably, the number of non-zero eigenvalues emerges as the most robust feature, while community-detection results depend strongly on the algorithm: Clauset-Newman-Moore (CNM) and Louvain are more stable than label propagation or Girvan-Newman. These findings help researchers select robust network features for noisy ecological data and motivate making sensitivity analysis a standard part of network-based ecological inference.

Abstract

Species interaction networks are a powerful tool for describing ecological communities; they typically contain nodes representing species and edges representing interactions between those species. To draw abstract inferences about groups of similar networks, ecologists often use graph topology metrics to summarize structural features. However, gathering the data that underlie these networks is challenging, which can lead to some interactions being missed. Thus, it is important to understand how much different structural metrics are affected by missing data. To address this question, we analyzed a database of 148 real-world bipartite networks representing four different types of species interactions (pollination, host-parasite, plant-ant, and seed-dispersal). For each network, we measured six different topological properties: number of connected components, variance in node betweenness, variance in node PageRank, largest eigenvalue, number of non-zero eigenvalues, and community detection as determined by four different algorithms. We then tested how these properties change as additional edges, representing data that may have been missed, are added to the networks. We found substantial variation in how robust different properties were to the missing data. For example, the Clauset-Newman-Moore and Louvain community detection algorithms showed much more gradual change as edges were added than the label propagation and Girvan-Newman algorithms did, suggesting that the former are more robust. Robustness also varied for some metrics based on interaction type. These results provide a foundation for selecting network properties to use when analyzing messy ecological network data.
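
The edge-addition procedure described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes that $m$ is the number of observed edges, that the added edges are drawn uniformly at random from the absent plant-animal pairs, and it tracks only one of the six properties (the number of connected components). The toy network and the helper names `add_missing_edges` and `count_components` are hypothetical.

```python
import random
from itertools import product


def add_missing_edges(edges, top, bottom, k, rng):
    """Return a copy of `edges` with k extra bipartite edges.

    Assumption: extra edges are sampled uniformly at random from the
    top-bottom pairs that are absent from the observed network.
    """
    existing = set(edges)
    candidates = [pair for pair in product(top, bottom) if pair not in existing]
    k = min(k, len(candidates))  # cannot add more edges than are missing
    return existing | set(rng.sample(candidates, k))


def count_components(nodes, edges):
    """Count connected components with a small union-find structure."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
    return len({find(n) for n in nodes})


# Hypothetical toy network: four plants, each observed with one pollinator.
plants = ["p1", "p2", "p3", "p4"]
pollinators = ["a1", "a2", "a3", "a4"]
observed = {("p1", "a1"), ("p2", "a2"), ("p3", "a3"), ("p4", "a4")}

m = len(observed)        # assumed meaning of m: number of observed edges
rng = random.Random(0)   # fixed seed so the run is repeatable

trajectory = []
for k in range(m // 2 + 1):  # add 0, 1, ..., m/2 edges
    augmented = add_missing_edges(observed, plants, pollinators, k, rng)
    trajectory.append((k, count_components(plants + pollinators, augmented)))

print(trajectory)
```

A property that changes slowly along `trajectory` would count as robust in the sense used here; a fuller analysis would repeat the random draw many times for each `k` and average the result.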

Paper Structure

This paper contains 15 sections, 7 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: PageRank. Initialization step (left) and final PageRank value after nodes' interactions (right). The size of nodes has a direct relationship with their PageRank value.
  • Figure 2: Performance of different community detection algorithms on the same graph: communities detected in the original graph (top) and communities detected after adding 7 random edges (bottom).
  • Figure 3: Graph (1), Adjacency Matrix (2), Degree Matrix (3), Laplacian Matrix (4)
  • Figure 4: The effect of missing data on (left) the number of non-zero eigenvalues, and (right) the number of connected components detected by the number of zero eigenvalues.
  • Figure 5: The effect of missing data on the number of communities detected by four algorithms, ordered left to right and top to bottom: (1) Clauset-Newman-Moore, (2) Louvain, (3) label propagation, and (4) Girvan-Newman.
  • ...and 2 more figures
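
The matrices named in Figure 3 and the zero-eigenvalue count mentioned in Figure 4 are linked by a standard spectral fact. The statement below is a sketch that assumes an undirected graph and the unnormalized Laplacian; the symbols $A$, $D$, $L$, $\lambda_i$, and $c$ are introduced here for illustration and may not match the paper's notation.

$$
L = D - A, \qquad D_{ii} = \sum_{j} A_{ij}, \qquad c = \#\{\, i : \lambda_i(L) = 0 \,\}
$$

Here $A$ is the adjacency matrix, $D$ the diagonal degree matrix, $\lambda_i(L)$ the eigenvalues of the Laplacian, and $c$ the number of connected components: the multiplicity of the zero eigenvalue of $L$ equals the number of components.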