TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations

Zheng Zhang, Yuntong Hu, Bo Pan, Chen Ling, Liang Zhao

TL;DR

TAGA tackles unsupervised representation learning for Text-Attributed Graphs (TAGs) by synergizing textual and structural information through two mutually informative views: Text-of-Graph (TofG) and Graph-of-Text (GoT). A Graph2Text module converts neighborhood structures into hierarchical documents, using a Hierarchical Document Layout (HDL) to preserve graph topology in text, while a GNN processes the Graph-of-Text view; the two views are aligned with a hierarchical self-supervised loss $L = L_{\text{positive}} + L_{\text{negative}}$ to capture joint semantics. To scale to large TAGs, TAGA introduces a structure-preserving random walk that mimics human reading and reduces the cost of processing long text, enabling efficient training and inference. Empirically, TAGA achieves strong zero-shot and few-shot performance across eight real-world datasets, with substantial improvements over both graph-only pre-training and PLM baselines, and demonstrates robust transferability between domains. The combination of dual-view alignment, HDL, and efficient training makes TAGA a strong framework for universal TAG representations with practical, scalable utility.

Abstract

Text-Attributed Graphs (TAGs) enhance graph structures with natural language descriptions, enabling detailed representation of data and their relationships across a broad spectrum of real-world scenarios. Despite the potential for deeper insights, existing TAG representation learning primarily relies on supervised methods, necessitating extensive labeled data and limiting applicability across diverse contexts. This paper introduces a new self-supervised learning framework, Text-And-Graph Multi-View Alignment (TAGA), which overcomes these constraints by integrating TAGs' structural and semantic dimensions. TAGA constructs two complementary views: Text-of-Graph view, which organizes node texts into structured documents based on graph topology, and the Graph-of-Text view, which converts textual nodes and connections into graph data. By aligning representations from both views, TAGA captures joint textual and structural information. In addition, a novel structure-preserving random walk algorithm is proposed for efficient training on large-sized TAGs. Our framework demonstrates strong performance in zero-shot and few-shot scenarios across eight real-world datasets.
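
To make the idea of a random walk over a node's neighborhood concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract only states that the walk is structure-preserving, and the figure captions mention a jumping ratio and a maximum walk length. The function name `sample_walk`, the adjacency-dict input, and the rule "jump back to the center node with probability `jump_ratio`" are all assumptions made for illustration.

```python
import random


def sample_walk(adj, root, max_len=8, jump_ratio=0.2, seed=None):
    """Sample a walk of at most `max_len` nodes starting at `root`.

    Assumed rule (illustrative, not taken from the paper): at each step,
    jump back to `root` with probability `jump_ratio`, otherwise move to
    a uniformly chosen neighbor of the current node.
    """
    rng = random.Random(seed)
    if not adj.get(root):
        # An isolated node has nothing to walk over.
        return [root]
    walk = [root]
    current = root
    while len(walk) < max_len:
        neighbors = adj.get(current, [])
        if not neighbors or rng.random() < jump_ratio:
            current = root  # jump back to the center node
        else:
            current = rng.choice(neighbors)
        walk.append(current)
    return walk


# Toy undirected graph given as an adjacency dict (hypothetical data).
adj = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
walk = sample_walk(adj, "a", max_len=5, jump_ratio=0.3, seed=0)
```

Under this assumed rule, only the texts of the visited nodes would need to be read, so the input length is bounded by `max_len` rather than growing with the full multi-hop neighborhood.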

Paper Structure (25 sections, 10 equations, 5 figures, 6 tables, 2 algorithms)

Figures (5)

  • Figure 1: Illustration of the two distinct views of TAGs: (left) Graph-of-Text and (right) Text-of-Graph. The Graph-of-Text view constructs graph-structured data over the individual text corpora, while the Text-of-Graph view organizes the text nodes and their connection descriptions in a hierarchically laid-out document. The two views can be mutually transformed into each other.
  • Figure 2: Illustration of the proposed self-supervised learning framework. (a) Generation of different orders of Graph-of-Text views; (b) the $\mathrm{Graph2Text}$ module, which transforms a Graph-of-Text view into a Text-of-Graph view; (c) the alignment module via hierarchical self-supervised learning.
  • Figure 3: (top) Number of words and (middle) training time for the full method versus the random walk algorithm; (bottom) inference time of PLM versus TAGA as the number of hops varies.
  • Figure 4: Comparison of five-shot performance across (top) different GNN encoder choices, (middle) varying jumping ratios, and (bottom) maximum walk lengths of the random walks.
  • Figure : Hierarchical Document Layout (HDL) for Graph2Text
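
As a rough illustration of what a hierarchical document built from a node's neighborhood might look like, the Python sketch below renders a small ego-graph as a numbered, indented outline. The exact HDL format is not given in this section, so the numbering scheme, the indentation, and the names `graph2text`, `adj`, and `text` are assumptions for illustration only.

```python
def graph2text(adj, text, root, hops=2):
    """Render the `hops`-hop neighborhood of `root` as an indented outline.

    Each node becomes one numbered line; its neighbors become sub-items.
    Nodes already on the current path are skipped so the outline cannot
    loop back on itself.
    """
    lines = []

    def visit(node, depth, number, path):
        lines.append(f"{'  ' * depth}{number} {text[node]}")
        if depth == hops:
            return
        children = [n for n in adj.get(node, []) if n not in path]
        for i, child in enumerate(children, start=1):
            visit(child, depth + 1, f"{number}.{i}", path | {child})

    visit(root, 0, "1", {root})
    return "\n".join(lines)


# Toy citation graph with hypothetical node texts.
adj = {"p1": ["p2", "p3"], "p2": ["p1", "p4"], "p3": ["p1"], "p4": ["p2"]}
text = {"p1": "Paper one", "p2": "Paper two", "p3": "Paper three", "p4": "Paper four"}
doc = graph2text(adj, text, "p1", hops=2)
```

The nesting depth of each line records how many hops the node is from the center, which is one simple way a flat text document can retain topology.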