Community structure in social and biological networks

Michelle Girvan, M. E. J. Newman

TL;DR

Networks exhibit common properties such as small-world behavior, skewed degree distributions, and clustering; the paper highlights community structure as tightly knit groups with sparser intergroup links and introduces an edge-betweenness–based method to detect such communities. The method removes high-betweenness edges to reveal community boundaries, recalculating betweenness after each removal, and runs in worst-case O(m^2 n). It is validated on computer-generated graphs with known partitions and on real networks (Zachary's karate club, college football) where it shows high accuracy. The authors apply the method to a collaboration network and a marine food web to uncover meaningful, interpretable divisions, and discuss extensions and scalability challenges for large or dense networks.
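
As a rough illustration of the edge-removal loop summarized above, here is a minimal Python sketch. It is not the authors' code: the function names (`build_graph`, `edge_betweenness`, `components`, `split_once`) are hypothetical, the betweenness calculation uses a standard breadth-first accumulation that is assumed here rather than taken from the paper, and the loop stops at the first split instead of building the full hierarchical tree. The graph is assumed to be undirected, unweighted, and free of self-loops.

```python
from collections import deque


def build_graph(edges):
    """Adjacency sets for an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_betweenness(adj):
    """Shortest-path betweenness of every edge (assumed BFS-based accumulation)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        # Breadth-first search: count shortest paths from source to each vertex.
        dist, paths, preds = {source: 0}, {source: 1}, {source: []}
        order, queue = [], deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], paths[w], preds[w] = dist[v] + 1, 0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, sharing credit among predecessors.
        credit = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                share = paths[v] / paths[w] * (1.0 + credit[w])
                score[frozenset((v, w))] += share
                credit[v] += share
    # Every vertex pair was counted once from each endpoint, so halve the totals.
    return {edge: s / 2.0 for edge, s in score.items()}


def components(adj):
    """Connected components as a list of vertex sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def split_once(adj):
    """Remove highest-betweenness edges until the component count increases."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    initial = len(components(adj))
    while any(adj.values()):
        score = edge_betweenness(adj)  # recalculated after every removal
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
        parts = components(adj)
        if len(parts) > initial:
            return parts
    return components(adj)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
parts = split_once(build_graph(edges))
print(sorted(sorted(p) for p in parts))  # [[0, 1, 2], [3, 4, 5]]
```

In this toy graph the bridge carries all nine shortest paths between the two triangles, so it is removed first and the graph separates into the two triangles. Recomputing betweenness inside the loop mirrors the recalculation step described above, and it is the source of the cost: one betweenness pass over all sources, repeated for up to m removals.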

Abstract

A number of recent studies have focused on the statistical properties of networked systems such as social networks and the World-Wide Web. Researchers have concentrated particularly on a few properties which seem to be common to many networks: the small-world property, power-law degree distributions, and network transitivity. In this paper, we highlight another property which is found in many networks, the property of community structure, in which network nodes are joined together in tightly-knit groups between which there are only looser connections. We propose a new method for detecting such communities, built around the idea of using centrality indices to find community boundaries. We test our method on computer-generated and real-world graphs whose community structure is already known, and find that it detects this known structure with high sensitivity and reliability. We also apply the method to two networks whose community structure is not well known - a collaboration network and a food web - and find that it detects significant and informative community divisions in both cases.

Paper Structure

This paper contains 12 sections, 2 equations, 7 figures.

Figures (7)

  • Figure 1: A schematic representation of a network with community structure. In this network there are three communities of densely connected vertices (circles with solid lines), with a much lower density of connections (gray lines) between them.
  • Figure 2: An example of a small hierarchical clustering tree. The circles at the bottom of the figure represent the vertices in the network and the tree shows the order in which they join together to form communities for a given definition of the weight $W_{ij}$ of connections between vertex pairs.
  • Figure 3: The fraction of vertices correctly classified by our method as the number $z_{\rm out}$ of inter-community edges per vertex is varied, for computer-generated graphs of the type described in the text. The measurements with half-integer values $z_{\rm out}=k+\frac{1}{2}$ are for graphs in which half the vertices had $k$ inter-community connections and half had $k+1$. Each point is an average over 100 realizations of the graphs. Lines between points are included solely as a guide to the eye.
  • Figure 4: (a) The friendship network from Zachary's karate club study [Zachary77], as described in the text. Nodes associated with the club administrator's faction are drawn as circles, while those associated with the instructor's faction are drawn as squares. (b) The hierarchical tree showing the complete community structure for the network. The initial split of the network into two groups is in agreement with the actual factions observed by Zachary, with the exception that node 3 is misclassified.
  • Figure 5: Hierarchical tree for the network reflecting the schedule of regular-season Division I college football games for the year 2000. Nodes in the network represent teams and edges represent games between teams. Our algorithm identifies nearly all the conference structure in the network.
  • ...and 2 more figures