Attributed Graph Alignment

Ning Zhang, Ziao Wang, Weina Wang, Lele Wang

TL;DR

This work introduces the attributed Erdős–Rényi pair model $\mathcal{G}(n,\bm{p};m,\bm{q})$ to study graph alignment with publicly available side information. It derives information-theoretic achievability and converse results for exact vertex alignment, showing how attribute information can reduce the required topology/attribute similarity for recovery and yielding a spectrum of regimes spanning topology-only to attribute-only. The results unify and extend classic models (ER graph pair, seeded ER, and bipartite graph alignment) and provide explicit phase-transition conditions via the quantities $\psi_{\mathrm{u}}$, $\psi_{\mathrm{a}}$, and their sum with $\log n$. The findings quantify when exact alignment is possible and highlight the potential practical impact of publicly available attributes in de-anonymization and related tasks, while also outlining avenues for efficient algorithms and broader attributed models.

Abstract

Motivated by various data science applications including de-anonymizing user identities in social networks, we consider the graph alignment problem, where the goal is to identify the vertex/user correspondence between two correlated graphs. Existing work mostly recovers the correspondence by exploiting the user-user connections. However, in many real-world applications, additional information about the users, such as user profiles, might be publicly available. In this paper, we introduce the attributed graph alignment problem, where additional user information, referred to as attributes, is incorporated to assist graph alignment. We establish both the achievability and converse results on recovering vertex correspondence exactly, where the conditions match for certain parameter regimes. Our results span the full spectrum between models that only consider user-user connections and models where only attribute information is available.

Paper Structure

This paper contains 21 sections, 22 theorems, 171 equations, 2 figures, 1 table.

Key Result

Theorem 1

Consider the attributed Erdős–Rényi pair $\mathcal{G}(n,\bm{p};m,\bm{q})$. If […], then the MAP estimator achieves exact alignment w.h.p.

Figures (2)

  • Figure 1: Example of an attributed Erdős–Rényi graph pair: graphs $G_1$ and $G_2$ are generated on the same set of vertices. The anonymized graph $G_2'$ is obtained by applying $\Pi^* = (1)(2,3)$ only on $\mathcal{V}_{\mathrm{a}}$ of $G_2$ (the permutation $\Pi^*$ is written in cycle notation).
  • Figure 2: The green region is information-theoretically achievable and the shaded grey region is not achievable. The three lines represent three specialized settings: the blue line (correlated Erdős–Rényi model) is obtained by setting $q_{00}=1$; the yellow line (seeded Erdős–Rényi model) is obtained by setting $\bm{p}=\bm{q}$; the red line (correlated bipartite model) is obtained by setting $p_{00}=1$. The intersections of these lines with the achievable and non-achievable regions give the information-theoretic limits of the correlated Erdős–Rényi model, the seeded Erdős–Rényi model, and the correlated bipartite model, respectively.
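
The construction in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names, the ordering $(p_{11}, p_{10}, p_{01}, p_{00})$ of the joint edge probabilities, and the choice to relabel only user vertices (attribute labels stay public, as the abstract describes) are assumptions made here.

```python
import random
from itertools import combinations

# Joint outcomes (edge in G1, edge in G2); the ordering is an assumption.
OUTCOMES = [(1, 1), (1, 0), (0, 1), (0, 0)]


def sample_edge_pair(rng, probs):
    """Draw one correlated edge pair with weights (p11, p10, p01, p00)."""
    return rng.choices(OUTCOMES, weights=probs, k=1)[0]


def attributed_er_pair(n, p, m, q, seed=0):
    """Hypothetical sampler: n users, m attributes, joint laws p and q.

    Returns (G1, G2', pi), where each graph is a pair
    (user-user edge set, user-attribute edge set) and pi[i] is the
    label that user i of G2 receives in the anonymized graph G2'.
    """
    rng = random.Random(seed)

    # Correlated user-user edges, one joint draw per unordered user pair.
    g1_uu, g2_uu = set(), set()
    for i, j in combinations(range(n), 2):
        e1, e2 = sample_edge_pair(rng, p)
        if e1:
            g1_uu.add((i, j))
        if e2:
            g2_uu.add((i, j))

    # Correlated user-attribute edges, one joint draw per (user, attribute).
    g1_ua, g2_ua = set(), set()
    for i in range(n):
        for a in range(m):
            e1, e2 = sample_edge_pair(rng, q)
            if e1:
                g1_ua.add((i, a))
            if e2:
                g2_ua.add((i, a))

    # Anonymize G2: relabel users by a uniformly random permutation.
    # Attribute labels are left untouched (assumption stated above).
    pi = list(range(n))
    rng.shuffle(pi)
    g2p_uu = {tuple(sorted((pi[i], pi[j]))) for i, j in g2_uu}
    g2p_ua = {(pi[i], a) for i, a in g2_ua}

    return (g1_uu, g1_ua), (g2p_uu, g2p_ua), pi
```

Passing `q = (0, 0, 0, 1)`, that is $q_{00}=1$, produces no user-attribute edges in either graph, which corresponds to the correlated Erdős–Rényi specialization described for Figure 2.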

Theorems & Definitions (47)

  • Theorem 1: General achievability
  • Theorem 2: Achievability in sparse region
  • Theorem 3: Converse
  • Corollary 1: Simplified achievability
  • Theorem 4: Best-known information-theoretic limits [settling-TIT, Cul-Kiy-exact2017]
  • Remark 1
  • Theorem 5: Specialization from attributed Erdős–Rényi pair
  • Remark 2
  • Theorem 6: Best-known information-theoretic limits in the sparse and symmetric regime [Cul-Kiy-exact2017, Mos-Xu-seeded2020, wang-2022-feasible]
  • Remark 3: Efficient algorithms for seeded graph alignment
  • ...and 37 more