MultiGraphMatch: a subgraph matching algorithm for multigraphs

Giovanni Micale, Antonio Di Maria, Roberto Grasso, Vincenzo Bonnici, Alfredo Ferro, Dennis Shasha, Rosalba Giugno, Alfredo Pulvirenti

TL;DR

MultiGraphMatch addresses submultigraph matching in labeled, attributed multigraphs. It indexes target edges with a bit signature matrix and drives the search with compatibility domains and an order-aware matching strategy. It integrates symmetry breaking via NAUTY and handles CYPHER queries, including WHERE clauses, by turning their constraints into pre-filtering steps and search-time checks. The approach yields substantial speedups over state-of-the-art systems on synthetic and real networks, and ablation studies confirm the critical roles of the bit matrix and the edge-processing order. Indexing and filtering costs scale linearly or near-linearly, which makes the approach practical for querying large real-world multigraphs.

Abstract

Subgraph matching is the problem of finding all the occurrences of a small graph, called the query, in a larger graph, called the target. Although the problem has been widely studied in simple graphs, few solutions have been proposed for multigraphs, in which two nodes can be connected by multiple edges, each denoting a possibly different type of relationship. In our new algorithm MultiGraphMatch, nodes and edges can be associated with labels and multiple properties. MultiGraphMatch introduces a novel data structure called bit matrix to efficiently index both the query and the target and filter the set of target edges that are matchable with each query edge. In addition, the algorithm proposes a new technique for ordering the processing of query edges based on the cardinalities of the sets of matchable edges. Using the CYPHER query definition language, MultiGraphMatch can perform queries with logical conditions on node and edge labels. We compare MultiGraphMatch with SuMGra and graph database systems Memgraph and Neo4J, showing comparable or better performance in all queries on a wide variety of synthetic and real-world graphs.
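
The filtering and ordering ideas described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the signature layout (one bit per source label, edge type, and destination label), the subset test, and the ascending-cardinality order are simplifying assumptions, and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Illustrative edge model: (source node labels, edge type, destination node labels).
Edge = Tuple[FrozenSet[str], str, FrozenSet[str]]


def assign_bits(edges: Iterable[Edge]) -> Dict[Tuple[str, str], int]:
    """Give every (role, label) pair seen in the edges its own bit position."""
    bit_of: Dict[Tuple[str, str], int] = {}
    for src, etype, dst in edges:
        keys = [("src", lab) for lab in sorted(src)]
        keys.append(("type", etype))
        keys += [("dst", lab) for lab in sorted(dst)]
        for key in keys:
            bit_of.setdefault(key, len(bit_of))
    return bit_of


def signature(edge: Edge, bit_of: Dict[Tuple[str, str], int]) -> int:
    """Encode an edge as an integer bit mask (assumed layout, see lead-in)."""
    src, etype, dst = edge
    sig = 1 << bit_of[("type", etype)]
    for lab in src:
        sig |= 1 << bit_of[("src", lab)]
    for lab in dst:
        sig |= 1 << bit_of[("dst", lab)]
    return sig


def candidate_sets(query: Dict[str, Edge], target: Dict[str, Edge]) -> Dict[str, List[str]]:
    """For each query edge, keep the target edges whose signature covers it."""
    bit_of = assign_bits(list(query.values()) + list(target.values()))
    target_sigs = {tid: signature(edge, bit_of) for tid, edge in target.items()}
    domains: Dict[str, List[str]] = {}
    for qid, q_edge in query.items():
        q_sig = signature(q_edge, bit_of)
        # A target edge is matchable if every bit set in the query is set in it.
        domains[qid] = [tid for tid, t_sig in target_sigs.items()
                        if (q_sig & t_sig) == q_sig]
    return domains


def processing_order(domains: Dict[str, List[str]]) -> List[str]:
    """Order query edges by the size of their candidate set (smallest first)."""
    return sorted(domains, key=lambda qid: (len(domains[qid]), qid))
```

Processing the most selective query edge first keeps the search tree narrow near the root, which is the usual motivation for cardinality-based orderings. A single bitwise AND per edge pair makes the pre-filter cheap compared with comparing label sets directly.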

Paper Structure

This paper contains 25 sections, 3 equations, 13 figures, 11 tables, 2 algorithms.

Figures (13)

  • Figure 1: Example of multigraph with actors and directors of the movie industry.
  • Figure 2: Example of sub-multigraph matching. There is only one occurrence of query $Q$ in target $T$. A subgraph matching algorithm for simple graphs must first find all possible correct mappings between nodes of $Q$ and $T$, then post-process the results to find the correct mapping between query and target edge labels. By contrast, MultiGraphMatch retrieves the correct node and edge mappings during the matching, without the need for post-processing.
  • Figure 3: Toy example of SubMultigraph Matching (SMM) with a query $Q$ and a target $T$. Colors in nodes and edges represent different labels or types. Target nodes $t_1$, $t_3$, $t_4$ and $t_5$ have two labels. Black dashed lines denote a possible node mapping, with query nodes $q_1$, $q_2$, $q_3$ and $q_4$ mapped to target nodes $t_1$, $t_4$, $t_3$ and $t_5$, respectively. Orange dotted lines depict a possible edge mapping, with query edges $w$, $x$, $y$ and $z$ mapped to target edges $f$, $d$, $b$ and $h$, respectively.
  • Figure 4: Indexing data structures associated with the target $T$ of Fig. 3: a) Node label graph, b) Edge types map, c) Edge properties table, d) Bit signature matrix.
  • Figure 5: Example of symmetry breaking conditions on nodes and edges, and their application when matching queries $Q'$ and $Q''$ against the target multigraph $T$ of Fig. 3. a) In query $Q'$, nodes $q_2$ and $q_3$ are in the same orbit, which yields the breaking condition $q_2 \prec q_3$. Assuming that $t_2<t_3$ in $T$, applying this condition on $T$ discards solution $S'_2$. b) In query $Q''$, edges $x$ and $y$ are in the same orbit, which yields the breaking condition $x \prec y$. Assuming that $d<e$ in $T$, applying this condition on $T$ discards solution $S''_2$.
  • ...and 8 more figures
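
The symmetry breaking idea in the Figure 5 caption can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the orbit-derived conditions are hard-coded rather than computed with NAUTY, solutions are filtered after the fact rather than pruned during the search, and the two example solutions used to exercise it are hypothetical, since the caption does not spell out their mappings.

```python
from typing import Dict, List, Tuple

Mapping = Dict[str, str]          # query element id -> target element id
Condition = Tuple[str, str]       # (a, b) encodes the breaking condition a ≺ b


def satisfies(mapping: Mapping, conditions: List[Condition],
              rank: Dict[str, int]) -> bool:
    """True if, for every condition a ≺ b, the image of a precedes the image of b."""
    return all(rank[mapping[a]] < rank[mapping[b]] for a, b in conditions)


def break_symmetries(solutions: List[Mapping], conditions: List[Condition],
                     rank: Dict[str, int]) -> List[Mapping]:
    """Keep one representative per group of solutions that differ only by a symmetry."""
    return [s for s in solutions if satisfies(s, conditions, rank)]
```

With a condition `("q2", "q3")` and a target order in which `t2` precedes `t3`, a mapping that sends `q2` to `t3` and `q3` to `t2` is dropped, while its mirror image is kept. The same function works for edge conditions such as `("x", "y")` if `rank` also orders target edges.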

Theorems & Definitions (2)

  • definition 1
  • definition 2: SubMultigraph Matching (SMM)