Graph matching based on similarities in structure and attributes

Raphaël Candelier

TL;DR

The Graph Attributes and Structure Matching (GASM) algorithm integrates structural and attribute information in a unified framework. It consistently finds solutions as good as or better than those of state-of-the-art algorithms, with similar processing times.

Abstract

Finding vertex-to-vertex correspondences in real-world graphs is a challenging task with applications in a wide variety of domains. Structural matching based on graph connectivity has attracted considerable attention, while the integration of the other information carried by vertex and edge attributes has mostly been left aside. Here we present the Graph Attributes and Structure Matching (GASM) algorithm, which provides high-quality solutions by integrating all the available information in a unified framework. Parameters quantifying the reliability of the attributes tune how much the solutions rely on the structure or on the attributes. We further show that, even without attributes, GASM consistently finds solutions as good as or better than those of state-of-the-art algorithms, with similar processing times.

Paper Structure (30 sections, 27 equations, 13 figures)

This paper contains 30 sections, 27 equations, 13 figures.

Figures (13)

  • Figure 1: Example of matching degeneracy introduced by the score matrix. a) The graphs to match. b) Form of the score matrix returned by any algorithm exploiting the graph structure. The best matching solutions are composed of the grayed cells exclusively. c) The 4 matchings with a maximum total score of $2(a+c)$, along with their structural quality $q_S$. d) The only 2 matchings respecting structural correspondence.
  • Figure 2: Managing symmetry with minute noise. a) Self-matching ($G_A = G_B$) of a simple directed graph with a symmetry. b) Score matrix $X_{\tilde{k}=2}$ given by the algorithm of Zager et al. [Zager_2008], without normalization. The matching solutions comprise only the grayed cells. c) The 4 corresponding matchings with the best score ($48$), along with their accuracy $\gamma$ and structural quality $q_S$. d) Example of score matrix $X_{\tilde{k}=2}$ produced by the GASM algorithm, without normalization and with a relatively large noise $\eta=10^{-2}$ to ease visualization. Here the best matching solution lies in the green cells, but other initial random numbers could favor the orange cells. e) The 2 corresponding matching solutions.
  • Figure 3: Propagation of attribute information through branches. a) Self-matching ($G_A = G_B$) of a simple directed branched graph with a categorical attribute on vertices. One vertex has a different value than the others, symbolized by a red square. b) Score matrix $X_{\tilde{k}=2}$ given by the Zager algorithm, without normalization. The matching solutions comprise only the grayed cells. c) The 2 corresponding matching solutions with the best score ($48$), along with their accuracy $\gamma$ and structural quality $q_S$. d) Integer part of the score matrix $X_{\tilde{k}=2}$ produced by the GASM algorithm, without normalization. The decimal part, due to the artificial noise, is negligible for the matching and is omitted to ease visualization. e) The corresponding matching solution.
  • Figure 4: Managing intrinsic indeterminacies over attributes. a) The two graphs share the same structure, but the categorical attribute of one vertex differs, symbolized by a red square. b) Score matrix $X_{\tilde{k}=2}$ given by the Zager algorithm, without normalization. The matching solutions comprise only the grayed cells. c) The 8 corresponding matching solutions with the best score ($36$), along with their accuracy $\gamma$ and structural quality $q_S$. d) Example of score matrix $X_{\tilde{k}=2}$ produced by the GASM algorithm, without normalization and with a relatively large noise $\eta=10^{-2}$ to ease visualization. Here the best matching solution lies in the orange cells, but other initial random numbers could favor the green cells. e) The 2 corresponding matching solutions.
  • Figure 5: Isomorphic matching of different types of graphs: a) balanced binary trees with depth $h$, b) star-branched graphs with $k=3$ branches of length $\beta$, c) circular ladders with $2c$ vertices, and d) random Erdős-Rényi (ER) $G_{np}$ graphs with $n_A=20$ vertices and edge probability $p$. Top: examples of each graph type. Middle: average accuracy $\gamma$ as a function of the graph parameters. Bottom: average structural quality $q_S$ computed over the same graphs. Colors are consistent in all panels. Each data point is averaged over $10^4$ samples, except for the balanced binary trees, where the number of samples varies with $h$ to keep a reasonable computation time; the data points for the 2opt algorithm are missing when $h>8$ due to a prohibitive computation time.
  • ...and 8 more figures
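
The degeneracy described in the captions of Figures 1 and 2 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the 4x4 block matrix `X`, the values of `a` and `c`, the helper `best_matchings`, and the uniform noise model are hypothetical and are not taken from the paper. The sketch only shows that a score matrix with repeated entries admits several matchings with the same maximal total score, and that a small additive noise of amplitude $\eta$ selects one of them.

```python
import itertools
import random


def best_matchings(X, tol=1e-12):
    """Brute-force search: return the maximal total score and every
    one-to-one matching (as a permutation) that reaches it within tol."""
    n = len(X)
    best, winners = float("-inf"), []
    for perm in itertools.permutations(range(n)):
        total = sum(X[i][perm[i]] for i in range(n))
        if total > best + tol:
            best, winners = total, [perm]
        elif abs(total - best) <= tol:
            winners.append(perm)
    return best, winners


# Hypothetical toy score matrix (an assumption, not the matrix of Figure 1):
# two 2x2 blocks of identical scores, so that four matchings tie at 2(a+c).
a, c = 3.0, 1.0
X = [
    [a, a, 0.0, 0.0],
    [a, a, 0.0, 0.0],
    [0.0, 0.0, c, c],
    [0.0, 0.0, c, c],
]
score, ties = best_matchings(X)

# Assumed noise model: independent uniform noise of amplitude eta on each cell.
rng = random.Random(0)
eta = 1e-2
X_noisy = [[x + eta * rng.random() for x in row] for row in X]
noisy_score, noisy_ties = best_matchings(X_noisy)

print(f"without noise: {len(ties)} matchings tie at score {score}")
print(f"with noise:    {len(noisy_ties)} matching(s) remain")
```

The brute-force enumeration is only practical for very small graphs; it is used here to make every tied solution visible rather than to suggest how matchings should be computed in practice.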