Scalable Deep Metric Learning on Attributed Graphs

Xiang Li, Gagan Agrawal, Ruoming Jin, Rajiv Ramnath

TL;DR

We develop a graph embedding method that extends deep metric learning and unbiased contrastive learning to attributed graphs, supports mini-batch training, and constructs representations with high scalability.

Abstract

We consider the problem of constructing embeddings of large attributed graphs and supporting multiple downstream learning tasks. We develop a graph embedding method that extends deep metric and unbiased contrastive learning techniques to 1) work with attributed graphs, 2) enable a mini-batch based approach, and 3) achieve scalability. Based on a multi-class tuplet loss function, we present two algorithms: DMT for semi-supervised learning and DMAT-i for the unsupervised case. Analyzing our methods, we provide a generalization bound for the downstream node classification task and, for the first time, relate tuplet loss to contrastive learning. Through extensive experiments, we show high scalability of representation construction and, across three downstream tasks (node clustering, node classification, and link prediction), better consistency than any single existing method.
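
To make the "multi-class tuplet loss" concrete, the sketch below computes the standard (N+1)-tuplet loss from the deep metric learning literature, $\log\big(1 + \sum_i \exp(f^\top f_i^- - f^\top f^+)\big)$, for one anchor, one positive, and a set of negatives. It is an illustration only: this summary does not give the exact DMT or DMAT-i objectives (which use several positives per tuplet), and the function names `dot` and `tuplet_loss` are hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length embedding vectors."""
    return sum(a * b for a, b in zip(u, v))

def tuplet_loss(anchor, positive, negatives):
    """Standard (N+1)-tuplet loss for a single anchor.

    Assumption: this is the generic single-positive form, not the
    paper's DMT / DMAT-i loss, which is not stated in this summary.
    """
    pos_score = dot(anchor, positive)
    # log(1 + sum_i exp(anchor.neg_i - anchor.pos))
    return math.log1p(
        sum(math.exp(dot(anchor, n) - pos_score) for n in negatives)
    )

# Toy embeddings: the loss is small when the positive is closer to the
# anchor than the negatives, and larger when the roles are swapped.
anchor = [1.0, 0.0]
loss_good = tuplet_loss(anchor, [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
loss_bad = tuplet_loss(anchor, [-1.0, 0.0], [[0.0, 1.0], [1.0, 0.0]])
```

The loss approaches zero as every negative scores far below the positive, which is the behavior a metric-learning objective of this family is designed to reward.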

Paper Structure

This paper contains 37 sections, 10 theorems, 41 equations, 6 figures, 9 tables, 1 algorithm.

Key Result

Lemma 1

For any embedding $f$, given the same size of tuplets sharing one positive sample ${x_{0}^{+}}$, i.e. $(x, x_{0}^+,\{x^-_i\}_{i=1}^{N-1})$ for $L_{\textnormal{Unbiased}}^{\mathcal{N}+1}$ and $(x, x_{0}^{+}, \{x^+_i\}_{i=1}^m, \{x^-_i\}_{i=1}^q )$ for $L_{\textnormal{DM(A)T}}^{\textnormal{m,q}}$, we

Figures (6)

  • Figure 1: Schematic of DMAT-i architecture. The graph filter generates smoothed node attributes $\mathcal{X}$ by incorporating graph structural information. A pair of views ($H_{1}, H_{2}$) of $\mathcal{X}$ is produced by augmentation and fed to the subsequent encoder $f$ to generate latent representations $U=f(H_{1})$ and $V=f(H_{2})$. Metric distance measurement is performed on $U \cup V$. For each sample $x \in U$, its counterpart $\bar{x} \in V$ is the only recognizable positive sample.
  • Figure 2: Scalability of Different Frameworks: Training Time vs. No. of Nodes in Graph
  • Figure 3: Cora t-SNE for different embeddings. Each color represents a distinct class.
  • Figure 4: DMAT-i Training Process on Coauthor PHY
  • Figure 5: Empirical evaluation of $\tau^{0}$ distribution across 8 datasets.
  • ...and 1 more figure
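
The data flow described in the Figure 1 caption can be sketched in a few lines. Everything concrete here is an assumption made for illustration, since the caption names the stages but not their implementations: the graph filter is taken to be neighbor averaging, the augmentation is random attribute masking, and the encoder $f$ is a single linear map. The helper names `smooth`, `augment`, and `encode` are hypothetical.

```python
import random

def smooth(adj, X):
    """Assumed graph filter: average each node's attributes with its neighbors'."""
    out = []
    for i, row in enumerate(X):
        group = [i] + adj[i]
        out.append([sum(X[j][d] for j in group) / len(group)
                    for d in range(len(row))])
    return out

def augment(X, drop_prob, rng):
    """Assumed augmentation: zero each attribute independently with drop_prob."""
    return [[0.0 if rng.random() < drop_prob else v for v in row] for row in X]

def encode(H, W):
    """Assumed encoder f: one linear layer, computing H @ W."""
    return [[sum(h[k] * W[k][d] for k in range(len(W)))
             for d in range(len(W[0]))] for h in H]

# Toy path graph 0 - 1 - 2 with two attributes per node.
adj = {0: [1], 1: [0, 2], 2: [1]}
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability only

rng = random.Random(0)
X_s = smooth(adj, X)                                     # smoothed attributes
H1, H2 = augment(X_s, 0.2, rng), augment(X_s, 0.2, rng)  # two views
U, V = encode(H1, W), encode(H2, W)                      # U = f(H1), V = f(H2)

# For node i in U, the only known positive is the same node i in V;
# every other row of U and V is treated as a candidate negative.
positives = {i: i for i in range(len(U))}
```

Pairing by node index is what makes the setup unsupervised: no labels are needed to identify the positive for each sample.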

Theorems & Definitions (15)

  • Lemma 1
  • Theorem 1
  • Theorem 2
  • Lemma 1
  • proof: Proof of Lemma \ref{lemma: app DMAT inequality}
  • Lemma A.1
  • proof: Proof of Lemma \ref{lemma: bound Delta}
  • Theorem 1
  • proof
  • Lemma A.2
  • ...and 5 more