Detecting Groups in Directed and Non-Directed Bipartite Networks

Alexandre Benatti, Luciano da F. Costa

TL;DR

This paper addresses the detection of groups in directed and non-directed bipartite networks by using a coincidence similarity index to transform bipartite representations into similarity networks whose connectivity reveals clusters. It builds on the coincidence similarity index, defined as $C(\vec{x},\vec{y})=J(\vec{x},\vec{y})\,I(\vec{x},\vec{y})$ with $J(\vec{x},\vec{y})=\frac{\sum_i \min\{x_i,y_i\}}{\sum_i \max\{x_i,y_i\}}$ and $I(\vec{x},\vec{y})=\frac{\sum_i \min\{x_i,y_i\}}{\min\{\sum_i x_i,\sum_i y_i\}}$, to transform bipartite data into coincidence similarity networks that expose community structure. The authors generate synthetic modular bipartite networks with a controllable number of groups and a controllable overlap, set by a scrambling probability $p$, and demonstrate that direct representations with more features yield the best group separation, outperforming projections with respect to modularity-based metrics. The results show robust separation even under substantial overlap and provide a scalable framework for evaluating group detection in bipartite systems, with avenues for extension to multipartite networks, variable group sizes, and node-subset analyses.
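
The index definitions above can be illustrated with a minimal sketch. It assumes non-negative feature vectors of equal length and returns 0 whenever a denominator vanishes; the function names (`jaccard`, `interiority`, `coincidence`, `coincidence_matrix`) and the zero-denominator convention are choices made for this illustration, not taken from the paper.

```python
import numpy as np

def jaccard(x, y):
    """J(x, y) = sum_i min(x_i, y_i) / sum_i max(x_i, y_i)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    den = np.maximum(x, y).sum()
    # Assumption: return 0 when both vectors are all zeros.
    return np.minimum(x, y).sum() / den if den > 0 else 0.0

def interiority(x, y):
    """I(x, y) = sum_i min(x_i, y_i) / min(sum_i x_i, sum_i y_i)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    den = min(x.sum(), y.sum())
    # Assumption: return 0 when either vector sums to zero.
    return np.minimum(x, y).sum() / den if den > 0 else 0.0

def coincidence(x, y):
    """C(x, y) = J(x, y) * I(x, y)."""
    return jaccard(x, y) * interiority(x, y)

def coincidence_matrix(features):
    """Pairwise coincidence between the rows of a feature matrix."""
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            sim[i, j] = sim[j, i] = coincidence(features[i], features[j])
    return sim

if __name__ == "__main__":
    x = [1, 2, 0]
    y = [1, 1, 1]
    # J = 2/4, I = 2/3, so C = 1/3
    print(coincidence(x, y))
```

For `x = [1, 2, 0]` and `y = [1, 1, 1]`, the sum of minima is 2, the sum of maxima is 4, and both vectors sum to 3, so $J = 1/2$, $I = 2/3$ and $C = 1/3$. Thresholding the matrix returned by `coincidence_matrix` would be one way to obtain a similarity network from such features.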

Abstract

Bipartite networks provide an effective resource for representing, characterizing, and modeling several abstract and real-world systems and structures involving binary relations, which include food webs, social interactions, and customer-product relationships. Of particular interest is the problem of identifying, in a given bipartite network, possible groups or clusters characterized by similar interconnecting patterns. The present work approaches this issue by extending and complementing a previously described coincidence similarity methodology (bioRxiv, doi.org/10.1101/2022.07.16.500294) in several ways, including the consideration of directed and non-directed bipartite networks, the characterization of groups in those networks, and the use of synthetic bipartite networks presenting groups as a resource for studying the performance of the described methodology. Several interesting results are described and discussed, including the corroboration of the potential of the coincidence similarity methodology for achieving enhanced separation between the groups in bipartite networks.

Paper Structure (10 sections, 9 equations, 13 figures, 3 tables)

Figures (13)

  • Figure 1: Example of a non-directed, weighted bipartite network involving 5 nodes of type $A$ and 8 nodes of type $B$. Interconnections are only possible between nodes of distinct types. The width of each interconnection indicates the respective weight. Except for incorporating two types of nodes, this structure can actually be understood as a classic non-directed, weighted graph with 13 nodes.
  • Figure 2: The non-directed network in Fig. 1 can be understood in terms of directed bipartite networks corresponding to direct (a) and reverse (b) representations. The latter type of representation is only applicable in the case of non-directed bipartite networks.
  • Figure 3: The projections of the bipartite network shown in Fig. 1 onto the $A$ and $B$ nodes, shown respectively in (a) and (b). These projections do not take into account the weights of the original bipartite network.
  • Figure 4: The bipartite network in Fig. 1 with nodes visualized irrespective of their types. It is interesting to observe the markedly distinct perception of the same bipartite network motivated by these two alternative representations.
  • Figure 5: A non-directed, weighted bipartite network presenting completely separated (no overlap of interconnections) clusters of nodes (a), and the same network represented with nodes in arbitrary order (b). Typically, a network to be studied is given as in (b), and cluster identification methods need to be applied in order to obtain representations with identified clusters such as that illustrated in (a).
  • ...and 8 more figures
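
The unweighted projections mentioned in the caption of Figure 3 can be sketched as follows. This is only an illustration using the standard definition of a bipartite projection (two nodes of the same type are linked when they share at least one neighbor of the other type); the function name `project`, its `onto` parameter, and the small weight matrix are hypothetical and do not come from the paper.

```python
import numpy as np

def project(weights, onto="A"):
    """Unweighted projection of a bipartite network.

    `weights` is a matrix with one row per A node and one column per
    B node (an assumed layout). Weights are discarded: only the
    presence of a connection is used.
    """
    b = (np.asarray(weights) != 0).astype(int)
    # Count shared neighbors between every pair of same-type nodes.
    shared = b @ b.T if onto == "A" else b.T @ b
    proj = (shared > 0).astype(int)
    np.fill_diagonal(proj, 0)  # no self-connections
    return proj

if __name__ == "__main__":
    # Hypothetical network with 3 nodes of type A and 4 of type B.
    w = np.array([
        [2, 0, 1, 0],
        [0, 3, 1, 0],
        [0, 0, 0, 5],
    ])
    print(project(w, onto="A"))
    print(project(w, onto="B"))
```

In this example the first two $A$ nodes become linked because both connect to the third $B$ node, while the last $A$ node stays isolated. Since the weights are dropped, a projection keeps less information than the weighted feature vectors of the original network.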