CombAlign: Enhancing Model Expressiveness in Unsupervised Graph Alignment

Songyang Chen, Yu Liu, Lei Zou, Zexuan Wang, Youfang Lin

TL;DR

This work tackles unsupervised graph alignment by examining model expressiveness and proposing a hybrid CombAlign framework that unites OT-based GW learning with embedding-based (WL-inspired) representations and non-uniform marginals. CombAlign uses a GW-driven module (GRAFT) with cross-dimensional feature interaction, a parameter-free WL module as a prior, and a NUM module to inject WL-informed priors into GW optimization, followed by a Combine module that casts the final alignment as a maximum weight matching problem to guarantee one-to-one and mutual alignment. Theoretical analysis demonstrates that feature transformation and non-uniform marginals enhance discriminative power for matching; convergence of the GW learning process is established under suitable conditions. Empirically, CombAlign delivers substantial gains (notably around 14.5% higher alignment accuracy) over state-of-the-art methods across six datasets, with ablations confirming the contribution of each module and practical scalability and efficiency. The approach offers a principled, expressive, and scalable pathway for unsupervised graph alignment with strong theoretical and empirical support.

Abstract

Unsupervised graph alignment finds the node correspondence between a pair of attributed graphs by exploiting only graph structure and node features. One category of recent studies first computes node representations and then matches nodes with the largest embedding-based similarity, while the other category reduces the problem to optimal transport (OT) via Gromov-Wasserstein learning. However, model expressiveness remains largely unexplored, as does the way theoretical expressivity affects prediction accuracy. We investigate model expressiveness from two aspects. First, we characterize the model's discriminative power in distinguishing matched and unmatched node pairs across two graphs. Second, we study the model's capability of guaranteeing node matching properties such as one-to-one matching and mutual alignment. Motivated by our theoretical analysis, we put forward a hybrid approach named CombAlign with stronger expressive power. Specifically, we enable cross-dimensional feature interaction for OT-based learning and propose an embedding-based method inspired by the Weisfeiler-Lehman test. We also apply non-uniform marginals obtained from the embedding-based modules to OT as priors for greater expressiveness. Building on these components, we propose a refinement step based on a traditional algorithm, which combines our OT and embedding-based predictions using an ensemble learning strategy and reduces the problem to maximum weight matching. With carefully designed edge weights, we ensure the matching properties above and further enhance prediction accuracy. Through extensive experiments, we demonstrate a significant improvement of 14.5% in alignment accuracy compared to state-of-the-art approaches and confirm the soundness of our theoretical analysis.
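
The reduction to maximum weight matching mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `combine_and_match`, the max-scaling step, and the plain sum used as edge weights are all assumptions, since the abstract only says the edge weights are "carefully designed". The solver is SciPy's `linear_sum_assignment` with `maximize=True`.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def combine_and_match(S_ot, S_emb):
    """Combine two (n_s x n_t) score matrices and return a one-to-one matching.

    Assumption: edge weights are the sum of the two matrices after each is
    scaled by its own maximum. The paper's actual weighting is not given here.
    """
    S_ot = np.asarray(S_ot, dtype=float)
    S_emb = np.asarray(S_emb, dtype=float)
    # Scale each matrix to [0, 1] so neither prediction dominates by magnitude.
    W = S_ot / S_ot.max() + S_emb / S_emb.max()
    # Maximum weight bipartite matching: each source node gets a distinct target.
    rows, cols = linear_sum_assignment(W, maximize=True)
    return dict(zip(rows.tolist(), cols.tolist()))


# Row-wise argmax of S_ot would send both source nodes to target 0.
S_ot = [[0.9, 0.8], [0.9, 0.1]]
S_emb = [[0.5, 0.6], [0.7, 0.2]]
match = combine_and_match(S_ot, S_emb)
```

Because the matching is solved jointly over all node pairs, the result is one-to-one by construction, which is the property a per-row argmax cannot guarantee.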

Paper Structure (23 sections, 6 theorems, 11 equations, 9 figures, 8 tables, 6 algorithms)

Key Result

Theorem 3.1

We are given the graph structures $\mathbf{A}_s, \mathbf{A}_t$ and node features $\mathbf{X}_s, \mathbf{X}_t$ as input. Denote feature propagation as $\mathbf{R}_p = g(\mathbf{A}_p) \mathbf{X}_p, p=s,t$, where $g(\cdot)$ is a function without learnable parameters. Denote the additional linear transf
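
As an illustration of the feature propagation $\mathbf{R}_p = g(\mathbf{A}_p) \mathbf{X}_p$ in the theorem statement, the sketch below picks one possible parameter-free $g(\cdot)$. The choice of a symmetrically normalized adjacency with self-loops, the `hops` parameter, and the function name `propagate` are assumptions made for the example; the theorem only requires that $g(\cdot)$ has no learnable parameters.

```python
import numpy as np


def propagate(A, X, hops=2):
    """Return R = g(A) X for a parameter-free g.

    Assumption: g(A) = (D^{-1/2} (A + I) D^{-1/2})^hops, a common choice
    that is not taken from the paper.
    """
    A = np.asarray(A, dtype=float)
    R = np.asarray(X, dtype=float)
    A_hat = A + np.eye(A.shape[0])      # self-loops keep every degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    G = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    for _ in range(hops):
        R = G @ R                       # one round of neighbor aggregation
    return R


# A 3-node path graph with 2-dimensional node features.
A_s = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
X_s = np.arange(6, dtype=float).reshape(3, 2)
R_s = propagate(A_s, X_s)
```

With this choice of $g(\cdot)$, relabeling the nodes of a graph permutes the rows of $\mathbf{R}_p$ in the same way, so matched nodes in two isomorphic graphs receive identical propagated features.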

Figures (9)

  • Figure 1: Illustration of model expressiveness for graph alignment, where (a) denotes the model's capability of separating matched and unmatched node pairs, while (b) & (c) showcase two disadvantages of pure learning-based approaches, i.e., one-to-many prediction regardless of the one-to-one matching constraint, and the inconsistency between row and column-wise alignment.
  • Figure 2: The overall framework of CombAlign. Modules in orange, green, and purple belong to the embedding-based, OT-based, and traditional algorithm-based approaches, respectively.
  • Figure 3: Illustration of the necessity of non-uniform marginals.
  • Figure 4: Percentage of one-to-many predictions and inconsistencies in terms of mutual alignment for representative baselines.
  • Figure 5: Ablation study on three real-world datasets.
  • ...and 4 more figures
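
The two failure modes named in the captions of Figures 1 and 4, one-to-many prediction and inconsistency between row-wise and column-wise alignment, can be measured on any score matrix. The sketch below is one plausible way to count them; the function name `matching_diagnostics` and the exact definitions of the two ratios are assumptions, not the paper's evaluation protocol.

```python
import numpy as np


def matching_diagnostics(S):
    """Return (one_to_many_ratio, mutual_inconsistency_ratio) for scores S.

    S[i, j] is the predicted similarity of source node i and target node j.
    Both ratios are fractions of source nodes (hypothetical definitions).
    """
    S = np.asarray(S, dtype=float)
    row_pred = S.argmax(axis=1)   # best target for each source node
    col_pred = S.argmax(axis=0)   # best source for each target node
    # One-to-many: the chosen target is also chosen by another source node.
    counts = np.bincount(row_pred, minlength=S.shape[1])
    one_to_many = float(np.mean(counts[row_pred] > 1))
    # Mutual inconsistency: i picks j, but j's best source is not i.
    sources = np.arange(S.shape[0])
    inconsistent = float(np.mean(col_pred[row_pred] != sources))
    return one_to_many, inconsistent


# Both source nodes prefer target 0, so the row-wise prediction is one-to-many.
S = np.array([[0.9, 0.8], [0.7, 0.1]])
ratios = matching_diagnostics(S)
```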

Theorems & Definitions (14)

  • Definition 1: Unsupervised Graph Alignment
  • Definition 2: Gromov-Wasserstein Discrepancy (GWD)
  • Theorem 3.1
  • proof
  • Corollary 3.2
  • proof
  • Corollary 3.3
  • proof
  • Theorem 3.4
  • proof
  • ...and 4 more
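
For readers unfamiliar with the Gromov-Wasserstein discrepancy named in Definition 2, the sketch below evaluates the commonly used square-loss GW objective for a given coupling. It assumes the standard form $\sum_{i,j,k,l} (C_s(i,k) - C_t(j,l))^2 \, T_{ij} T_{kl}$; the paper's own definition is not reproduced in this excerpt, and the names `gw_objective`, `C_s`, `C_t`, and `T` are chosen for illustration.

```python
import numpy as np


def gw_objective(C_s, C_t, T):
    """Square-loss GW objective: sum_{ijkl} (C_s[i,k] - C_t[j,l])^2 T[i,j] T[k,l].

    C_s, C_t are intra-graph cost (or adjacency) matrices and T is a coupling.
    The square is expanded so that no 4-dimensional tensor is materialized.
    """
    C_s, C_t, T = (np.asarray(M, dtype=float) for M in (C_s, C_t, T))
    mu = T.sum(axis=1)                         # source marginal of T
    nu = T.sum(axis=0)                         # target marginal of T
    term_s = mu @ (C_s ** 2) @ mu              # sum C_s[i,k]^2 mu_i mu_k
    term_t = nu @ (C_t ** 2) @ nu              # sum C_t[j,l]^2 nu_j nu_l
    cross = np.sum((C_s @ T @ C_t.T) * T)      # sum C_s[i,k] C_t[j,l] T_ij T_kl
    return float(term_s + term_t - 2.0 * cross)


# Two isomorphic 3-node path graphs; T encodes the true node correspondence.
C_s = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
P = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
C_t = P @ C_s @ P.T
T_true = P.T / 3.0
gw_true = gw_objective(C_s, C_t, T_true)
```

The objective vanishes at the true correspondence of two isomorphic graphs and is positive for couplings that map edges onto non-edges, which is why minimizing it over couplings yields an alignment.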