GRAPHTEXTACK: A Realistic Black-Box Node Injection Attack on LLM-Enhanced GNNs

Jiaji Ma, Puja Trivedi, Danai Koutra

TL;DR

The paper studies how vulnerable LLM-enhanced GNNs on text-attributed graphs are to realistic black-box, multi-modal adversaries. It proposes GRAPHTEXTACK, an evolutionary optimization framework that crafts injected nodes by jointly choosing their structure and semantics, without gradients or surrogate models, guided by a fitness function that combines local prediction disruption with global structural influence. The method degrades node classification performance more than 12 baselines across five datasets and two target models, while keeping runtime efficient. These findings highlight the need for defenses such as joint adversarial training and data filtering to secure LLM-enhanced GNN systems in practical settings, where injected nodes can influence learning without modifying existing data.

Abstract

Text-attributed graphs (TAGs), which combine structural and textual node information, are ubiquitous across many domains. Recent work integrates Large Language Models (LLMs) with Graph Neural Networks (GNNs) to jointly model semantics and structure, resulting in more general and expressive models that achieve state-of-the-art performance on TAG benchmarks. However, this integration introduces dual vulnerabilities: GNNs are sensitive to structural perturbations, while LLM-derived features are vulnerable to prompt injection and adversarial phrasing. While existing adversarial attacks largely perturb structure or text independently, we find that uni-modal attacks cause only modest degradation in LLM-enhanced GNNs. Moreover, many existing attacks assume unrealistic capabilities, such as white-box access or direct modification of graph data. To address these gaps, we propose GRAPHTEXTACK, the first black-box, multi-modal, poisoning node injection attack for LLM-enhanced GNNs. GRAPHTEXTACK injects nodes with carefully crafted structure and semantics to degrade model performance, operating under a realistic threat model without relying on model internals or surrogate models. To navigate the combinatorial, non-differentiable search space of connectivity and feature assignments, GRAPHTEXTACK introduces a novel evolutionary optimization framework with a multi-objective fitness function that balances local prediction disruption and global graph influence. Extensive experiments on five datasets and two state-of-the-art LLM-enhanced GNN models show that GRAPHTEXTACK significantly outperforms 12 strong baselines.
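The abstract's idea of a gradient-free evolutionary search with a two-part fitness can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's algorithm: the candidate encoding (a set of neighbors plus an index into a list of candidate texts), the mutation and crossover operators, the elitist selection rule, the weight `alpha`, and the stand-in scores `text_scores` and `degrees` that replace real black-box model queries and a real influence measure.

```python
import random

def make_toy_fitness(degrees, text_scores, alpha=0.5):
    """Hypothetical fitness: weighted sum of a local and a global term."""
    max_deg = max(degrees)
    def fitness(cand):
        edges, text = cand
        # Stand-in for local prediction disruption (would come from model queries).
        local = text_scores[text]
        # Stand-in for global influence: normalized degree of the chosen neighbors.
        global_term = sum(degrees[v] for v in edges) / (len(edges) * max_deg)
        return alpha * local + (1 - alpha) * global_term
    return fitness

def evolve(fitness, num_nodes, num_texts, d_max,
           pop_size=20, generations=30, seed=0):
    """Elitist evolutionary search over (neighbor set, text index) candidates."""
    rng = random.Random(seed)

    def random_candidate():
        return (frozenset(rng.sample(range(num_nodes), d_max)),
                rng.randrange(num_texts))

    def mutate(cand):
        edges, text = cand
        if rng.random() < 0.5:
            # Structural mutation: swap one neighbor for another node.
            edges = set(edges)
            edges.remove(rng.choice(sorted(edges)))
            edges.add(rng.choice([v for v in range(num_nodes) if v not in edges]))
            edges = frozenset(edges)
        else:
            # Semantic mutation: pick a different candidate text.
            text = rng.randrange(num_texts)
        return (edges, text)

    def crossover(a, b):
        # Child draws its neighbors from the union of both parents' neighbors.
        pool = sorted(a[0] | b[0])
        return (frozenset(rng.sample(pool, d_max)), rng.choice([a[1], b[1]]))

    population = [random_candidate() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[:pop_size // 2]  # keep the best half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b)))
        population = elite + children
    return max(population, key=fitness)

# Toy graph: 8 nodes with made-up degrees, 3 candidate texts with made-up scores.
degrees = [1, 5, 2, 8, 3, 7, 4, 6]
text_scores = [0.1, 0.9, 0.4]
fitness = make_toy_fitness(degrees, text_scores)
best_edges, best_text = evolve(fitness, num_nodes=len(degrees),
                               num_texts=len(text_scores), d_max=2)
```

Because the search only calls `fitness`, it needs no gradients, which is the property that makes this family of methods usable in a black-box setting. Keeping the best half of each generation means the best score never decreases between generations.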

Paper Structure

This paper contains 31 sections, 5 theorems, 14 equations, 2 figures, 7 tables, 1 algorithm.

Key Result

Proposition 5.1

Optimizing multi-modal node injection attacks in $G = (V, E)$ exactly requires searching over a space of size $O\left(|V|^{r \cdot d_{\max}} \times |\mathcal{F}|^r\right)$, which is exponentially large in the number of injected nodes $r$, the maximum degree per injected node $d_{\max}$, and the feat
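To get a feel for how quickly the expression inside the bound of Proposition 5.1 grows, the sketch below evaluates $|V|^{r \cdot d_{\max}} \times |\mathcal{F}|^r$ for concrete values. The function names and the sample numbers are illustrative assumptions, as is the reading of $|\mathcal{F}|$ as the number of candidate feature assignments per injected node; none of them come from the paper.

```python
import math

def search_space_size(num_nodes, r, d_max, num_features):
    """Exact value of |V|^(r * d_max) * |F|^r using Python's big integers."""
    return num_nodes ** (r * d_max) * num_features ** r

def search_space_log10(num_nodes, r, d_max, num_features):
    """Base-10 logarithm of the same quantity, convenient for large inputs."""
    return r * d_max * math.log10(num_nodes) + r * math.log10(num_features)

# Hypothetical setting: 2708 nodes, 5 injected nodes, at most 3 edges each,
# and 100 candidate feature assignments per injected node.
digits = search_space_log10(num_nodes=2708, r=5, d_max=3, num_features=100)
print(f"search space is roughly 10^{digits:.1f} candidates")
```

Doubling $r$ squares the count, which is why exhaustive enumeration is impractical even for small injection budgets.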

Figures (2)

  • Figure 1: Runtime per injection on the representation-level enhancer target model.
  • Figure 2: [Extended results] Comparison of runtime to generate an injection on the representation-level enhancer target model.

Theorems & Definitions (8)

  • Proposition 5.1: Search Space of Multi-Modal Injection
  • Lemma 5.2: Polynomial-Time Evolutionary Approximation
  • Lemma B.1: Multi-Modal Adversarial Synergy
  • Corollary B.2: Efficiency
  • Proof: Proof for Lemma B.1 (Multi-Modal Adversarial Synergy)
  • Lemma B.3: Approximate Shift Bound
  • Proof: Proof
  • Proof: Proof Sketch for Lemma \ref{lem:complexity}