GRAPHTEXTACK: A Realistic Black-Box Node Injection Attack on LLM-Enhanced GNNs
Jiaji Ma, Puja Trivedi, Danai Koutra
TL;DR
The paper studies how LLM-enhanced GNNs on text-attributed graphs can be attacked by realistic black-box, multi-modal adversaries. It proposes GraphTextack, an evolutionary optimization framework that jointly optimizes the connectivity and text of injected nodes, without gradients or surrogate models, using a fitness function that combines local prediction disruption and global structural influence. The method degrades node classification performance more than 12 baselines across five datasets and two target models, while keeping runtime efficient. These findings highlight the need for defenses such as joint adversarial training and data filtering to secure LLM-enhanced GNN systems in practical settings, where injected nodes can influence learning without modifying existing data.
Abstract
Text-attributed graphs (TAGs), which combine structural and textual node information, are ubiquitous across many domains. Recent work integrates Large Language Models (LLMs) with Graph Neural Networks (GNNs) to jointly model semantics and structure, resulting in more general and expressive models that achieve state-of-the-art performance on TAG benchmarks. However, this integration introduces dual vulnerabilities: GNNs are sensitive to structural perturbations, while LLM-derived features are vulnerable to prompt injection and adversarial phrasing. While existing adversarial attacks largely perturb structure or text independently, we find that uni-modal attacks cause only modest degradation in LLM-enhanced GNNs. Moreover, many existing attacks assume unrealistic capabilities, such as white-box access or direct modification of graph data. To address these gaps, we propose GRAPHTEXTACK, the first black-box, multi-modal, poisoning node injection attack for LLM-enhanced GNNs. GRAPHTEXTACK injects nodes with carefully crafted structure and semantics to degrade model performance, operating under a realistic threat model without relying on model internals or surrogate models. To navigate the combinatorial, non-differentiable search space of connectivity and feature assignments, GRAPHTEXTACK introduces a novel evolutionary optimization framework with a multi-objective fitness function that balances local prediction disruption and global graph influence. Extensive experiments on five datasets and two state-of-the-art LLM-enhanced GNN models show that GRAPHTEXTACK significantly outperforms 12 strong baselines.
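The abstract does not spell out the fitness terms or the genetic operators, so the following is only a minimal sketch of what a gradient-free evolutionary search over an injected node's connectivity and text could look like. Everything in it is an assumption made for illustration: the candidate encoding (a neighbor set plus a text index), the toy proxies `margin`, `degree`, and `text_score`, the weight `alpha`, and the function names `make_fitness` and `evolve`. In a real black-box setting the local term would presumably come from queries to the target model rather than from a fixed table.

```python
import random

def make_fitness(degree, margin, text_score, alpha=0.5):
    """Toy multi-objective fitness in [0, 1] (all inputs are illustrative proxies)."""
    max_deg = max(degree)

    def fitness(cand):
        edges, text = cand
        # Local term: low-margin neighbors are assumed easier to flip,
        # scaled by how disruptive the chosen text is assumed to be.
        local = text_score[text] * sum(1 - margin[v] for v in edges) / len(edges)
        # Global term: prefer attaching to high-degree (influential) nodes.
        global_ = sum(degree[v] for v in edges) / (len(edges) * max_deg)
        return alpha * local + (1 - alpha) * global_

    return fitness

def evolve(fitness, n_nodes, n_texts, budget, pop_size=20, generations=30, seed=0):
    """Search over (neighbor set, text index) candidates using only fitness values."""
    rng = random.Random(seed)

    def random_candidate():
        return (frozenset(rng.sample(range(n_nodes), budget)), rng.randrange(n_texts))

    def crossover(a, b):
        # Child draws its neighbors from the union of both parents' neighbors.
        pool = sorted(a[0] | b[0])
        return (frozenset(rng.sample(pool, budget)), rng.choice([a[1], b[1]]))

    def mutate(cand):
        edges, text = cand
        if rng.random() < 0.5:
            # Swap one neighbor for a node that is not currently connected.
            edges = set(edges)
            edges.remove(rng.choice(sorted(edges)))
            edges.add(rng.choice([v for v in range(n_nodes) if v not in edges]))
            edges = frozenset(edges)
        else:
            text = rng.randrange(n_texts)
        return (edges, text)

    population = [random_candidate() for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: the top half survives unchanged, so the best score never drops.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents)))
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)

# Toy graph with 8 existing nodes and 3 candidate texts (made-up values).
degree = [1, 4, 2, 6, 3, 5, 2, 1]
margin = [0.9, 0.2, 0.6, 0.1, 0.5, 0.3, 0.8, 0.7]
text_score = [0.2, 0.9, 0.5]

fit = make_fitness(degree, margin, text_score, alpha=0.5)
best = evolve(fit, n_nodes=len(degree), n_texts=len(text_score), budget=3)
print(sorted(best[0]), best[1], round(fit(best), 3))
```

The search touches the model only through `fitness`, which is what makes the approach compatible with a black-box threat model. Because the top half of each generation is carried over unchanged, the best fitness found is non-decreasing across generations.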
