When can networks be inferred from observed groups?

Zachary P. Neal

TL;DR

This work tackles the problem of inferring an unobserved undirected network from observed group memberships. It uses a factorial simulation design that crosses 10 unobserved network topologies, 6 levels of the number of observed groups, 6 levels of the probability $p$ that an observed group corresponds to a clique, and 2 inference methods (an unweighted projection and a stochastic degree sequence model (SDSM) backbone). Reconstruction accuracy is measured as the correlation $r$ between the unobserved and inferred networks. The findings show two regimes: with few observed groups, a simple unweighted projection is accurate when group memberships closely correspond to cliques (high $p$); with many observed groups, the SDSM backbone remains accurate even when clique correspondence is weak (low $p$). The study provides practical scope conditions and cautions for indirect network measurement, and points to Bayesian backbones and empirical validation as avenues for future work.

Abstract

Collecting network data directly from network members can be challenging. One alternative involves inferring a network from observed groups, for example, inferring a network of scientific collaboration from researchers' observed paper authorships. In this paper, I explore when an unobserved undirected network of interest can accurately be inferred from observed groups. The analysis uses simulations to experimentally manipulate the structure of the unobserved network to be inferred, the number of groups observed, the extent to which the observed groups correspond to cliques in the unobserved network, and the method used to draw inferences. I find that when a small number of groups are observed, an unobserved network can be accurately inferred using a simple unweighted two-mode projection, provided that each group's membership closely corresponds to a clique in the unobserved network. In contrast, when a large number of groups are observed, an unobserved network can be accurately inferred using a statistical backbone extraction model, even if the groups' memberships are mostly random. These findings offer guidance for researchers seeking to indirectly measure a network of interest using observations of groups.
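
To make the setup concrete, the following is a minimal sketch of one cell of such a simulation, not the paper's actual procedure. It assumes an Erdős–Rényi-style unobserved network, fixed-size groups, and a greedy clique sampler; the function names (`random_network`, `sample_clique`, `observe_groups`, `unweighted_projection`, `accuracy`) and all parameter values are hypothetical choices made for illustration. Only the unweighted projection is shown; the backbone extraction model is not implemented here.

```python
import numpy as np

def random_network(n, density, rng):
    """Symmetric 0/1 adjacency matrix without self-loops (assumed topology)."""
    upper = np.triu(rng.random((n, n)) < density, k=1)
    return (upper | upper.T).astype(int)

def sample_clique(adj, size, rng):
    """Greedily grow a clique from a random edge (assumes at least one edge).

    May return fewer than `size` members if the clique cannot be extended.
    """
    edges = np.argwhere(np.triu(adj, k=1))
    i, j = edges[rng.integers(len(edges))]
    members = [int(i), int(j)]
    candidates = np.flatnonzero(adj[i] & adj[j])  # common neighbours of i and j
    rng.shuffle(candidates)
    for c in candidates:
        if len(members) >= size:
            break
        if all(adj[c, m] for m in members):  # keep every pair adjacent
            members.append(int(c))
    return members

def observe_groups(adj, n_groups, p, size, rng):
    """Agents-by-groups incidence matrix.

    Each group is a clique of the unobserved network with probability p,
    and a uniformly random set of agents otherwise (illustrative assumption).
    """
    n = adj.shape[0]
    incidence = np.zeros((n, n_groups), dtype=int)
    for g in range(n_groups):
        if rng.random() < p:
            members = sample_clique(adj, size, rng)
        else:
            members = rng.choice(n, size=size, replace=False)
        incidence[members, g] = 1
    return incidence

def unweighted_projection(incidence):
    """Infer a tie between two agents if they share at least one group."""
    co_membership = incidence @ incidence.T
    np.fill_diagonal(co_membership, 0)
    return (co_membership > 0).astype(int)

def accuracy(true_adj, inferred_adj):
    """Pearson correlation between the two networks over all dyads."""
    dyads = np.triu_indices_from(true_adj, k=1)
    return np.corrcoef(true_adj[dyads], inferred_adj[dyads])[0, 1]

rng = np.random.default_rng(0)
truth = random_network(30, 0.2, rng)
for p in (0.2, 1.0):
    groups = observe_groups(truth, n_groups=40, p=p, size=3, rng=rng)
    inferred = unweighted_projection(groups)
    print(f"p={p}: r={accuracy(truth, inferred):.3f}")
```

In this sketch, setting `p=1.0` guarantees that every inferred tie is a true tie, because each group is a clique; lowering `p` introduces false ties from random co-memberships. Varying `n_groups` and `p` over a grid would give a rough analogue of the factorial design described above.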

Paper Structure (9 sections, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Relationship between an unobserved network, observed groups, and inferred network. Accuracy may depend on (A) the structure of the unobserved network, (B) the number of observed groups, (C) the extent to which observed groups correspond to cliques in the unobserved network, and (D) the method used to infer network relationships from group memberships.
  • Figure 2: Accuracy of a network inferred from observed groups using an unweighted projection, by (a) the structure of the unobserved network being inferred, (b) number of groups observed, and (c) extent to which the observed groups correspond to cliques. Accuracy is measured using the correlation between the unobserved and inferred networks.
  • Figure 3: Accuracy of a network inferred from observed groups using a backbone extracted with the stochastic degree sequence model, by (a) the structure of the unobserved network being inferred, (b) number of groups observed, and (c) extent to which the observed groups correspond to cliques. Accuracy is measured using the correlation between the unobserved and inferred networks.