Dynamic Bundling with Large Language Models for Zero-Shot Inference on Text-Attributed Graphs

Yusheng Zhao, Qixin Zhang, Xiao Luo, Weizhi Zhang, Zhiping Xiao, Wei Ju, Philip S. Yu, Ming Zhang

TL;DR

This work tackles zero-shot inference on text-attributed graphs by addressing two key challenges: sparse graph-structure information presented to LLMs and unreliable LLM outputs. It introduces Dynamic Text Bundling Supervision (DENSE), which forms text bundles around nodes, queries LLMs for bundle-level labels, and uses entropy-based and ranking-based losses to supervise a graph neural network, with online bundle refinement to suppress noisy items. The authors provide theoretical results on the tolerance of bundle supervision to outliers and convergence of the training process, and demonstrate state-of-the-art performance across ten diverse TAG datasets using GPT-4o and other LLM backbones. The approach yields robust, scalable zero-shot inference for TAGs and offers a principled framework for integrating LLM guidance with GNNs in non-Euclidean, text-rich graphs, with potential impacts across citation, web, social, and knowledge-graph domains.

Abstract

Large language models (LLMs) have been used in many zero-shot learning problems owing to their strong generalization ability. Recently, adopting LLMs in text-attributed graphs (TAGs) has drawn increasing attention. However, the adoption of LLMs faces two major challenges: limited information on graph structure and unreliable responses. LLMs struggle with text attributes isolated from the graph topology. Worse still, they yield unreliable predictions due to both information insufficiency and the inherent weaknesses of LLMs (e.g., hallucination). To this end, this paper proposes a novel method named Dynamic Text Bundling Supervision (DENSE) that queries LLMs with bundles of texts to obtain bundle-level labels and uses these labels to supervise graph neural networks. Specifically, we sample a set of bundles, each containing a set of nodes in close proximity together with their corresponding texts. We then query LLMs with the bundled texts to obtain the label of each bundle. Subsequently, the bundle labels are used to supervise the optimization of graph neural networks, and the bundles are further refined to exclude noisy items. To justify our design, we also provide a theoretical analysis of the proposed method. Extensive experiments across ten datasets validate the effectiveness of the proposed method.
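
The bundle sampling step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract only says that bundled nodes are "of close proximity", so measuring proximity by breadth-first hop distance is an assumption made here, and the function names `sample_bundle` and `sample_bundles` are hypothetical. The parameters `bundle_size` and `num_bundles` play the roles of $n_B$ and $n_S$ from the Figure 3 caption.

```python
import random
from collections import deque


def sample_bundle(adj, seed, bundle_size, rng):
    """Collect up to `bundle_size` nodes near `seed`.

    Assumption: proximity is breadth-first hop distance in the graph.
    `adj` maps each node to an iterable of its neighbors.
    """
    visited = {seed}
    bundle = [seed]
    queue = deque([seed])
    while queue and len(bundle) < bundle_size:
        node = queue.popleft()
        neighbors = list(adj.get(node, []))
        rng.shuffle(neighbors)  # randomize which same-distance nodes are taken
        for nb in neighbors:
            if nb not in visited:
                visited.add(nb)
                bundle.append(nb)
                queue.append(nb)
                if len(bundle) == bundle_size:
                    break
    return bundle


def sample_bundles(adj, num_bundles, bundle_size, seed=0):
    """Pick distinct seed nodes at random and grow one bundle around each."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    seeds = rng.sample(nodes, k=min(num_bundles, len(nodes)))
    return [sample_bundle(adj, s, bundle_size, rng) for s in seeds]


if __name__ == "__main__":
    # Toy graph: a triangle {0, 1, 2} and a separate edge {3, 4}.
    toy_adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4], 4: [3]}
    print(sample_bundles(toy_adj, num_bundles=2, bundle_size=3))
```

Each returned bundle is a list of node identifiers; the texts of those nodes would then be concatenated into a single LLM query. A bundle may be smaller than `bundle_size` when the seed's connected component runs out of nodes.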

Paper Structure

This paper contains 15 sections, 3 theorems, 9 equations, 5 figures, 3 tables.

Key Result

Theorem 3.1

Given a bundle $\mathcal{B}$, its corresponding bundle class distribution $\bm p(\mathcal{B})=(p_1, p_2, \dots, p_C)$, an outlier node $v_o,o\in \mathcal{B}$ with probability distribution $\bm p_o=(p'_1, p'_2, \dots, p'_C)$, denote $m'=\operatorname{argmax}_i \{p'_i\}_{i=1}^C$. If the bundle label … where $\hat{y}$ is the bundle label, $\mathcal{L}_{BE}$ is the bundle supervision, and $\mathcal{L}_{IE}$ …

Figures (5)

  • Figure 1: (a) Querying LLMs with individual texts and supervising graph learning with individual labels. (b) By creating text bundles, we perform bundle queries to obtain bundle labels for supervision.
  • Figure 2: The overall framework of our method. We first sample nodes of proximity to form bundles (a), which are then used to query the LLM about their main categories (b). Subsequently, the bundle labels from the LLM's response are used to supervise a graph neural network (c). During optimization, we further refine the bundle to exclude noisy nodes (d).
  • Figure 3: Left: prediction accuracies under different bundle sizes (i.e., $n_B$). Middle: prediction accuracies with different numbers of bundles (i.e., $n_S$). Right: accuracy comparison of individual query (I.Q.), bundle query (B.Q.), and our method (Ours).
  • Figure 4: The prompt template of bundle query (a), an example of the prompt on the CiteSeer dataset (b), and an example of the response of GPT-4o to the query (c).
  • Figure : The prediction accuracies under different LLM backbones on four datasets. The best is marked in bold and the second-best is underlined.
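
Steps (c) and (d) of Figure 2, supervising the network with bundle labels and refining bundles, can be sketched in plain Python under stated assumptions. The exact losses and refinement rule are not given on this page, so everything below is illustrative: the bundle distribution is assumed to be the mean of the per-node predicted class probabilities, the loss is a plain cross-entropy against the LLM's bundle label (the ranking-based loss is omitted), and the refinement rule with its `threshold` parameter is hypothetical. All function names are made up for this example.

```python
import math


def bundle_distribution(node_probs):
    """Assumed bundle distribution: mean of per-node class probabilities."""
    num_classes = len(node_probs[0])
    return [
        sum(p[c] for p in node_probs) / len(node_probs)
        for c in range(num_classes)
    ]


def bundle_cross_entropy(node_probs, bundle_label):
    """Cross-entropy between the bundle distribution and the bundle label."""
    prob = bundle_distribution(node_probs)[bundle_label]
    return -math.log(prob + 1e-12)  # small constant guards against log(0)


def refine_bundle(bundle, probs, bundle_label, threshold=0.5):
    """Hypothetical refinement: keep nodes that agree with the bundle label.

    `probs` maps node id -> predicted class probabilities. A node is dropped
    when its probability for `bundle_label` falls below `threshold`. If every
    node would be dropped, the bundle is returned unchanged.
    """
    kept = [v for v in bundle if probs[v][bundle_label] >= threshold]
    return kept if kept else list(bundle)


if __name__ == "__main__":
    probs = {0: [0.9, 0.1], 1: [0.2, 0.8], 2: [0.7, 0.3]}
    bundle = [0, 1, 2]
    loss = bundle_cross_entropy([probs[v] for v in bundle], bundle_label=0)
    print(round(loss, 4), refine_bundle(bundle, probs, bundle_label=0))
```

In a real training loop the probabilities would come from the graph neural network's softmax output and the loss would be back-propagated; here they are fixed numbers so the arithmetic can be followed by hand.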

Theorems & Definitions (7)

  • Theorem 3.1
  • Remark 1
  • Theorem 3.2
  • Remark 2
  • Remark 3
  • Theorem 3.3
  • Remark 4