Intruding with Words: Towards Understanding Graph Injection Attacks at the Text Level

Runlin Lei; Yuwei Hu; Yuchen Ren; Zhewei Wei

TL;DR

This paper pioneers the exploration of GIAs at the text level, presenting three novel attack designs that inject textual content into the graph and demonstrating that text interpretability, a factor previously overlooked at the embedding level, plays a crucial role in attack strength.

Abstract

Graph Neural Networks (GNNs) excel across various applications but remain vulnerable to adversarial attacks, particularly Graph Injection Attacks (GIAs), which inject malicious nodes into the original graph and pose realistic threats. Text-attributed graphs (TAGs), where nodes are associated with textual features, are crucial due to their prevalence in real-world applications and are commonly used to evaluate these vulnerabilities. However, existing research focuses only on embedding-level GIAs, which inject node embeddings rather than actual textual content, limiting their applicability and simplifying detection. In this paper, we pioneer the exploration of GIAs at the text level, presenting three novel attack designs that inject textual content into the graph. Through theoretical and empirical analysis, we demonstrate that text interpretability, a factor previously overlooked at the embedding level, plays a crucial role in attack strength. Among the designs we investigate, the Word-frequency-based Text-level GIA (WTGIA) is particularly notable for its balance between performance and interpretability. Despite the success of WTGIA, we discover that defenders can easily enhance their defenses with customized text embedding methods or large language model (LLM)-based predictors. These insights underscore the necessity for further research into the potential and practical significance of text-level GIAs.

Paper Structure (43 sections, 1 theorem, 17 equations, 7 figures, 44 tables, 4 algorithms)

This paper contains 43 sections, 1 theorem, 17 equations, 7 figures, 44 tables, 4 algorithms.

Introduction
Background and Preliminaries
Text-Level GIAs: Interpretability Matters
Inversion-based Text-Level GIAs: Effective yet Uninterpretable
LLM-based Text-Level GIAs: Interpretable yet Ineffective
Word-Frequency-based Text-Level GIAs
Row-wise Constrained FGSM: Unnoticeable and Effective at the Embedding Level
Trade-off for Interpretability at the Text Level
New Challenges for Text-level GIAs
Transferability to Different Embeddings
LLMs as Defender
Conclusion
Acknowledgement
Broader Impacts
Open Access to Data and Code
...and 28 more sections

Key Result

Theorem 1

In the setting outlined in Definition 1, assume we apply a cosine similarity constraint with a threshold $c \in (0, 1)$ for unnoticeability. Specifically, this constraint requires that the cosine similarity between $x_t$ and $x_i$ satisfies $\frac{x_t \cdot x_i}{\|x_t\| \|x_i\|} > c$. Let $a$ de
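
The cosine similarity constraint stated in the theorem can be illustrated with a minimal sketch. It assumes binary (one-hot style) feature vectors, as suggested by the title of Definition 1. The function names, the example vectors, and the handling of zero vectors are illustrative assumptions, not taken from the paper, and the excerpt does not say which of $x_t$ and $x_i$ is the injected node.

```python
import math


def cosine_similarity(x_t, x_i):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(x_t, x_i))
    norm_t = math.sqrt(sum(a * a for a in x_t))
    norm_i = math.sqrt(sum(b * b for b in x_i))
    # Assumption: treat similarity with an all-zero vector as 0.
    if norm_t == 0 or norm_i == 0:
        return 0.0
    return dot / (norm_t * norm_i)


def satisfies_constraint(x_t, x_i, c):
    """Check the unnoticeability constraint cos(x_t, x_i) > c, with c in (0, 1)."""
    return cosine_similarity(x_t, x_i) > c


# Hypothetical binary vectors over a 5-word vocabulary.
x_t = [1, 1, 0, 1, 0]
x_i = [1, 1, 1, 0, 0]

print(cosine_similarity(x_t, x_i))          # 2 shared words / (sqrt(3) * sqrt(3))
print(satisfies_constraint(x_t, x_i, 0.5))
print(satisfies_constraint(x_t, x_i, 0.7))
```

With binary vectors, the similarity depends only on the number of shared words and the number of words in each vector, so a higher threshold $c$ forces the two vectors to share a larger fraction of their words.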

Figures (7)

Figure 1: Illustration of the Text-Level GIA setup and the three designs explored.
Figure 2: The change in attack performance as the weight of HAO increases. Lower performance indicates better attack results. Details of the setting are given in the appendix.
Figure 3: The best and average performance of FGSM against GCN and EGNNGuard among the five injection methods w.r.t. increasing sparsity budgets. The results for EGNNGuard reveal that as the budget increases, FGSM attacks can satisfy the similarity constraint while significantly enhancing the attack performance at the embedding level.
Figure 4: Performance of WTGIA against GCN. Sparsity budget is the average sparsity of the original dataset. Methods with -T include topic requirements in the prompt. Methods with -WM exclude masks for prohibited words in Llama. Avg Emb. represents the average FGSM attack performance at the embedding level. Lower values indicate better attack performance.
Figure 5: The performance of WTGIA without topic requirements against GCN w.r.t. the sparsity budget. As the budget increases, the use rate keeps decreasing, while the attack performance first increases and then decreases.
...and 2 more figures

Theorems & Definitions (3)

Definition 1: Single node GIAs towards one-hot embedding
Theorem 1
proof
