KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs

Aihua Pei; Zehua Yang; Shunan Zhu; Ruoxi Cheng; Ju Jia; Lina Wang

TL;DR

KGPA introduces a knowledge-graph-based framework to evaluate the adversarial robustness of large language models across domain-diverse knowledge graphs, eliminating reliance on manually annotated benchmarks. It comprises modules that convert KG triplets to original prompts (T2P), generate adversarial prompts (KGB-FSA and APGP), and refine prompt quality with a Prompt Refinement Engine (PRE) using LLMScore, guided by a tau_llm threshold. Robustness is quantified via NRA, RRA, and ASR across general and specialized knowledge graphs and models such as GPT-3.5-turbo, GPT-4-turbo, and GPT-4o, revealing that robustness ranks as GPT-4-turbo > GPT-4o > GPT-3.5-turbo and that domain knowledge influences performance. The approach demonstrates lower resource costs relative to benchmark-heavy frameworks while delivering actionable insights into cross-domain robustness and the effectiveness of different prompt-generation strategies.
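The summary above names a quality threshold (tau_llm) and three metrics (NRA, RRA, ASR) without defining them. The sketch below is one plausible reading, not the authors' implementation. It assumes that the refinement step keeps adversarial prompts whose LLMScore is at least tau_llm, that NRA is accuracy on original prompts, that RRA is accuracy on adversarial prompts, and that ASR is the share of originally correct answers that the attack flips. The `Sample` type, its field names, and the 0-to-1 score scale are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One evaluation record (hypothetical structure, not from the paper)."""
    label: bool             # ground truth for the triplet-derived statement
    pred_original: bool     # model answer on the original prompt
    pred_adversarial: bool  # model answer on the adversarial prompt
    llm_score: float        # assumed quality score of the adversarial prompt


def refine(samples, tau_llm):
    """Assumed refinement rule: keep samples scoring at least tau_llm."""
    return [s for s in samples if s.llm_score >= tau_llm]


def robustness_metrics(samples):
    """Compute NRA, RRA, and ASR under the assumed definitions above."""
    if not samples:
        raise ValueError("need at least one sample")
    n = len(samples)
    # Assumed NRA: accuracy on original prompts.
    nra = sum(s.pred_original == s.label for s in samples) / n
    # Assumed RRA: accuracy on adversarial prompts.
    rra = sum(s.pred_adversarial == s.label for s in samples) / n
    # Assumed ASR: fraction of originally correct answers flipped by the attack.
    correct = [s for s in samples if s.pred_original == s.label]
    if correct:
        asr = sum(s.pred_adversarial != s.label for s in correct) / len(correct)
    else:
        asr = 0.0
    return {"NRA": nra, "RRA": rra, "ASR": asr}
```

Under these assumptions, a more robust model shows a higher RRA and a lower ASR at a comparable NRA, which matches the direction of the ranking reported above.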

Abstract

Existing frameworks for assessing the robustness of large language models (LLMs) depend too heavily on specific benchmarks, which increases costs and, because of dataset limitations, prevents evaluation of LLM performance in professional domains. This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). Our framework generates original prompts from knowledge graph triplets, creates adversarial prompts by poisoning them, and assesses the robustness of LLMs from the results of these adversarial attacks. We systematically evaluate the effectiveness of this framework and its modules. Experiments show that the adversarial robustness of the ChatGPT family ranks as GPT-4-turbo > GPT-4o > GPT-3.5-turbo, and that the robustness of large language models is influenced by the professional domains in which they operate.

Paper Structure (24 sections, 3 equations, 13 figures, 15 tables)

Introduction
Related Works
Robustness Evaluation of LLMs
Attack Prompt Generation from KG
Few-Shot Attack Strategy
Methodology
Original Prompt Generation
Adversarial Prompt Generation
Robustness Evaluation Metrics
Experiments
Arrangement
Robustness Evaluation of ChatGPT Family
Experimental Analysis of KGPA Modules
Conclusion
Experimentation Details
...and 9 more sections

Figures (13)

Figure 1: Framework of Knowledge Graph Based PromptAttack (KGPA)
Figure 2: LLM-based transformation strategy example
Figure 3: Knowledge graph-based few-shot attack strategy module (KGB-FSA)
Figure 4: APGP: Adversarial prompt generation
Figure 5: The basic architecture of prompt refinement engine (PRE) module
...and 8 more figures
