Stochastic Degree Sequence Model with Edge Constraints (SDSM-EC) for Backbone Extraction

Zachary P. Neal, Jennifer Watling Neal

TL;DR

The paper addresses bias in backbones extracted from projected bipartite networks, which depends on choosing an appropriate null model. It introduces SDSM-EC, an extension of the Stochastic Degree Sequence Model (SDSM) whose null model enforces edge constraints (prohibited edges), with $Q'_{ik}$ computed to exclude impossible cell configurations. In toy data and empirical preschool data, SDSM-EC yields sparser backbones by omitting edges that appear significant under the unconstrained null model, which shows why constraints must be incorporated correctly. The method is implemented in the backbone R package (sdsm()), and the authors discuss extensions to required edges and faster estimation of $Q'_{ik}$, both of practical value for unbiased backbone analysis in constrained bipartite settings.

Abstract

It is common to use the projection of a bipartite network to measure a unipartite network of interest. For example, scientific collaboration networks are often measured using a co-authorship network, which is the projection of a bipartite author-paper network. Caution is required when interpreting the edge weights that appear in such projections. However, backbone models offer a solution by providing a formal statistical method for evaluating when an edge in a projection is statistically significantly strong. In this paper, we propose an extension to the existing Stochastic Degree Sequence Model (SDSM) that allows the null model to include edge constraints (EC) such as prohibited edges. We demonstrate the new SDSM-EC in toy data and empirical data on young children's play interactions, illustrating how it correctly omits noisy edges from the backbone.
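The projection mentioned in the abstract can be illustrated with a minimal sketch. The author and paper data below are hypothetical, and `project` is an illustrative helper, not part of the backbone package. The weight between two authors is taken to be the number of papers they share, which is the usual definition of a bipartite projection.

```python
# Hypothetical author-by-paper incidence matrix (rows: authors, columns: papers).
authors = ["A", "B", "C"]
B = [
    [1, 1, 0],  # A wrote papers 1 and 2
    [1, 1, 1],  # B wrote papers 1, 2 and 3
    [0, 0, 1],  # C wrote paper 3
]


def project(incidence):
    """Return the weighted projection: P[i][j] = number of artifacts shared by agents i and j."""
    n = len(incidence)
    P = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # leave the diagonal at zero; self-ties are not of interest
                P[i][j] = sum(a * b for a, b in zip(incidence[i], incidence[j]))
    return P


P = project(B)
for name, row in zip(authors, P):
    print(name, row)
```

A backbone model then asks which of these weights are larger than expected under a null model, rather than reading the raw counts directly.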

Paper Structure (10 sections, 2 equations, 3 figures)

This paper contains 10 sections, 2 equations, 3 figures.

Figures (3)

  • Figure 1: (A) The cardinality of the space of matrices with row sums {1,1,2,2} and column sums {1,1,2,2} and one or two cells constrained to zero is small compared to the cardinality of the space without constrained cells. (B) The deviation between the true and estimated $Q_{ik}$ for all such constrained spaces tends to be small.
  • Figure 2: (A) A bipartite network containing two groups of agents and two groups of artifacts, such that agents are connected only to their own group's artifacts. (B) The SDSM backbone of a projection of this bipartite graph, which assumes that an agent could be connected to another group's artifact, suggests within-group cohesion among agents. (C) The SDSM-EC projection, which assumes that an agent could not be connected to another group's artifact, suggests none of the edges in the projection are significant.
  • Figure 3: (A) Backbone extracted using SDSM and (B) SDSM-EC from 1829 observations of 53 preschool children's play groups. Vertex shape represents age-based classrooms: circles = 3-year-old classroom, squares = 4-year-old classroom. Vertex color represents attendance status: black = full day, gray = AM only, white = PM only.
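
The kind of space described in the Figure 1 caption is small enough to enumerate by brute force. The sketch below is illustrative only and is not the authors' procedure: it assumes that the "true" $Q_{ik}$ is the proportion of matrices in the space that contain a 1 in cell $(i,k)$, and the choice of prohibited cell is hypothetical. The function names `enumerate_space` and `true_q` are likewise made up for this example.

```python
from itertools import combinations, product


def enumerate_space(row_sums, col_sums, prohibited=frozenset()):
    """List every 0/1 matrix with the given margins and zeros in all prohibited cells."""
    n_cols = len(col_sums)
    row_options = []
    for i, r in enumerate(row_sums):
        # Columns in which row i is allowed to hold a 1.
        allowed = [k for k in range(n_cols) if (i, k) not in prohibited]
        opts = [
            tuple(1 if k in ones else 0 for k in range(n_cols))
            for ones in combinations(allowed, r)
        ]
        row_options.append(opts)
    space = []
    for rows in product(*row_options):
        # Keep only the combinations of rows whose column sums also match.
        if all(sum(row[k] for row in rows) == col_sums[k] for k in range(n_cols)):
            space.append(rows)
    return space


def true_q(space):
    """Proportion of matrices in the space with a 1 in each cell (assumed meaning of Q_ik)."""
    n = len(space)
    n_rows, n_cols = len(space[0]), len(space[0][0])
    return [
        [sum(m[i][k] for m in space) / n for k in range(n_cols)]
        for i in range(n_rows)
    ]


row_sums = (1, 1, 2, 2)
col_sums = (1, 1, 2, 2)

unconstrained = enumerate_space(row_sums, col_sums)
# Hypothetical constraint: the cell in row 0, column 0 is prohibited.
constrained = enumerate_space(row_sums, col_sums, prohibited=frozenset({(0, 0)}))

print("unconstrained space size:", len(unconstrained))
print("constrained space size:  ", len(constrained))

Q_prime = true_q(constrained)
for row in Q_prime:
    print([round(q, 3) for q in row])
```

Enumeration is only feasible for tiny matrices like this one; for realistic data the probabilities have to be estimated, which is why the caption compares estimated values against the true ones.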