KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs

Aihua Pei, Zehua Yang, Shunan Zhu, Ruoxi Cheng, Ju Jia, Lina Wang

TL;DR

KGPA introduces a knowledge-graph-based framework to evaluate the adversarial robustness of large language models across domain-diverse knowledge graphs, eliminating reliance on manually annotated benchmarks. It comprises modules that convert KG triplets to original prompts (T2P), generate adversarial prompts (KGB-FSA and APGP), and refine prompt quality with a Prompt Refinement Engine (PRE) using LLMScore, guided by a tau_llm threshold. Robustness is quantified via NRA, RRA, and ASR across general and specialized knowledge graphs and models such as GPT-3.5-turbo, GPT-4-turbo, and GPT-4o, revealing that robustness ranks as GPT-4-turbo > GPT-4o > GPT-3.5-turbo and that domain knowledge influences performance. The approach demonstrates lower resource costs relative to benchmark-heavy frameworks while delivering actionable insights into cross-domain robustness and the effectiveness of different prompt-generation strategies.
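The three metrics named above can be illustrated with a minimal sketch. This summary does not spell out their definitions, so the ones used here are assumptions: NRA is taken as accuracy on original prompts, RRA as accuracy on adversarial prompts, and ASR as the share of originally correct answers that the attack flips to incorrect. The `Outcome` record and `robustness_metrics` function are hypothetical names, not part of the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Outcome:
    """Hypothetical record for one original/adversarial prompt pair."""
    correct_on_original: bool
    correct_on_adversarial: bool


def robustness_metrics(outcomes: List[Outcome]) -> Dict[str, float]:
    """Compute NRA, RRA, and ASR under the assumed definitions above."""
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one outcome")

    # Assumed: NRA = accuracy on original prompts.
    nra = sum(o.correct_on_original for o in outcomes) / n
    # Assumed: RRA = accuracy on adversarial prompts.
    rra = sum(o.correct_on_adversarial for o in outcomes) / n

    # Assumed: ASR = fraction of originally correct answers flipped by the attack.
    originally_correct = [o for o in outcomes if o.correct_on_original]
    if originally_correct:
        flipped = sum(not o.correct_on_adversarial for o in originally_correct)
        asr = flipped / len(originally_correct)
    else:
        asr = 0.0

    return {"NRA": nra, "RRA": rra, "ASR": asr}
```

Under these assumed definitions, a more robust model shows a higher RRA and a lower ASR at a comparable NRA.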

Abstract

Existing frameworks for assessing robustness of large language models (LLMs) overly depend on specific benchmarks, increasing costs and failing to evaluate performance of LLMs in professional domains due to dataset limitations. This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). Our framework generates original prompts from the triplets of knowledge graphs and creates adversarial prompts by poisoning, assessing the robustness of LLMs through the results of these adversarial attacks. We systematically evaluate the effectiveness of this framework and its modules. Experiments show that adversarial robustness of the ChatGPT family ranks as GPT-4-turbo > GPT-4o > GPT-3.5-turbo, and the robustness of large language models is influenced by the professional domains in which they operate.
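The pipeline described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the fixed sentence template stands in for the T2P module, swapping the object of a triplet is one simple way to poison it, and the threshold filter mimics the role of a quality score compared against tau_llm. All names (`Triplet`, `triplet_to_prompt`, `poison`, `refine`) are hypothetical.

```python
from typing import Callable, List, NamedTuple


class Triplet(NamedTuple):
    """A knowledge-graph fact: (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


def triplet_to_prompt(t: Triplet) -> str:
    """Template-based stand-in for T2P: state the fact and ask for a verdict."""
    statement = f"{t.subject} {t.predicate} {t.obj}."
    return f'Is the following statement true or false? "{statement}"'


def poison(t: Triplet, replacement_obj: str) -> Triplet:
    """Assumed poisoning step: replace the object to produce a false triplet."""
    if replacement_obj == t.obj:
        raise ValueError("replacement must differ from the original object")
    return t._replace(obj=replacement_obj)


def refine(candidates: List[str],
           score: Callable[[str], float],
           tau_llm: float) -> List[str]:
    """Keep only candidate prompts whose quality score reaches the threshold."""
    return [c for c in candidates if score(c) >= tau_llm]


# Usage: build an original prompt and a poisoned counterpart from one triplet.
fact = Triplet("Paris", "is the capital of", "France")
original_prompt = triplet_to_prompt(fact)
poisoned_prompt = triplet_to_prompt(poison(fact, "Germany"))
```

In a real setting the `score` argument would be an LLM-based judgment rather than a local function; it is left as a plain callable here so the sketch stays self-contained.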

Paper Structure

This paper contains 24 sections, 3 equations, 13 figures, 15 tables.

Figures (13)

  • Figure 1: Framework of Knowledge Graph Based PromptAttack (KGPA)
  • Figure 2: LLM-based transformation strategy example
  • Figure 3: Knowledge graph-based few-shot attack strategy module (KGB-FSA)
  • Figure 4: APGP: Adversarial prompt generation
  • Figure 5: The basic architecture of prompt refinement engine (PRE) module
  • ...and 8 more figures