MultiGraphMatch: a subgraph matching algorithm for multigraphs
Giovanni Micale, Antonio Di Maria, Roberto Grasso, Vincenzo Bonnici, Alfredo Ferro, Dennis Shasha, Rosalba Giugno, Alfredo Pulvirenti
TL;DR
MultiGraphMatch addresses submultigraph matching in labeled, attributed multigraphs by introducing a bit signature matrix that indexes target edges, together with a matching strategy driven by compatibility domains and by the order in which query edges are processed. It integrates symmetry breaking via NAUTY and handles Cypher queries, including WHERE clauses, by turning constraints into pre-filtering steps and into checks applied during the search. The approach yields substantial speedups over state-of-the-art systems on synthetic and real networks, and ablation studies confirm the critical roles of the bit matrix and the edge-processing order. The work reports linear or near-linear indexing and filtering costs and scales to large graphs, which makes it practical for real-world multigraph querying in various domains.
Abstract
Subgraph matching is the problem of finding all the occurrences of a small graph, called the query, in a larger graph, called the target. Although the problem has been widely studied in simple graphs, few solutions have been proposed for multigraphs, in which two nodes can be connected by multiple edges, each denoting a possibly different type of relationship. In our new algorithm MultiGraphMatch, nodes and edges can be associated with labels and multiple properties. MultiGraphMatch introduces a novel data structure called a bit matrix to efficiently index both the query and the target and to filter the set of target edges that are matchable with each query edge. In addition, the algorithm introduces a new technique for ordering the processing of query edges based on the cardinalities of the sets of matchable edges. Using the Cypher query language, MultiGraphMatch can perform queries with logical conditions on node and edge labels. We compare MultiGraphMatch with SuMGra and the graph database systems Memgraph and Neo4j, showing comparable or better performance in all queries on a wide variety of synthetic and real-world graphs.
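To make the two ideas in the abstract concrete, the following Python sketch illustrates signature-based edge filtering and cardinality-based ordering on a toy multigraph. It is not the authors' implementation: the signature layout (one bit per source label, destination label, and edge type), the containment test, the smallest-domain-first rule, and all names (`make_encoder`, `signature`, `build_rows`, `compatible_pairs`, `order_by_domain_size`) are assumptions chosen for illustration. Node and edge properties are ignored.

```python
from itertools import count


def make_encoder():
    """Return a function that maps features to bits, one new bit per unseen feature."""
    positions = {}
    next_bit = count()

    def encode(features):
        sig = 0
        for feature in features:
            if feature not in positions:
                positions[feature] = next(next_bit)
            sig |= 1 << positions[feature]
        return sig

    return encode


def signature(encode, src_label, dst_label, edge_types):
    """Assumed row layout: source label, destination label, and the edge types present."""
    features = [("src", src_label), ("dst", dst_label)]
    features += [("edge", t) for t in edge_types]
    return encode(features)


def build_rows(node_labels, multi_edges, encode):
    """One signature row per ordered node pair (u, v) joined by at least one edge."""
    return {
        (u, v): signature(encode, node_labels[u], node_labels[v], types)
        for (u, v), types in multi_edges.items()
    }


def compatible_pairs(query_sig, target_rows):
    """Keep target pairs whose row has every bit of the query row set."""
    return [pair for pair, sig in target_rows.items() if (query_sig & sig) == query_sig]


def order_by_domain_size(domains):
    """Assumed heuristic: process query pairs with the fewest candidates first."""
    return sorted(domains, key=lambda q: len(domains[q]))


# Toy target: parallel edges between nodes 1 and 2 model a multigraph.
encode = make_encoder()
target_labels = {1: "Person", 2: "Person", 3: "City", 4: "Person"}
target_edges = {
    (1, 2): ["KNOWS", "WORKS_WITH"],
    (2, 4): ["KNOWS"],
    (1, 3): ["LIVES_IN"],
    (2, 3): ["LIVES_IN"],
    (4, 3): ["LIVES_IN"],
}
target_rows = build_rows(target_labels, target_edges, encode)

# Toy query, encoded with the same encoder so that bit positions agree.
query_labels = {"a": "Person", "b": "Person", "c": "City"}
query_edges = {("a", "b"): ["KNOWS", "WORKS_WITH"], ("a", "c"): ["LIVES_IN"]}
query_rows = build_rows(query_labels, query_edges, encode)

domains = {q: compatible_pairs(sig, target_rows) for q, sig in query_rows.items()}
order = order_by_domain_size(domains)
```

In this sketch the bitwise containment test is only a necessary condition: it discards target pairs that cannot match, but the surviving candidates would still need to be verified during the search, for example against edge multiplicities or property constraints.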
