PromptExp: Multi-granularity Prompt Explanation of Large Language Models

Ximing Dong, Shaowei Wang, Dayi Lin, Gopi Krishnan Rajbahadur, Boquan Zhou, Shichao Liu, Ahmed E. Hassan

Abstract

Large Language Models (LLMs) excel in tasks like natural language understanding and text generation. Prompt engineering plays a critical role in leveraging LLMs effectively. However, the black-box nature of LLMs hinders their interpretability and effective prompt engineering. A wide range of model explanation approaches have been developed for deep learning models. However, these local explanations are designed for single-output tasks like classification and regression, and cannot be directly applied to LLMs, which generate sequences of tokens. Recent efforts in LLM explanation focus on natural language explanations, but they are prone to hallucinations and inaccuracies. To address this, we introduce PromptExp, a framework for multi-granularity prompt explanations by aggregating token-level insights. PromptExp introduces two token-level explanation approaches: (1) an aggregation-based approach combining local explanation techniques, and (2) a perturbation-based approach with novel techniques to evaluate token masking impact. PromptExp supports both white-box and black-box explanations and extends explanations to higher granularity levels, enabling flexible analysis. We evaluate PromptExp in case studies such as sentiment analysis, showing that the perturbation-based approach performs best when semantic similarity is used to assess perturbation impact. Furthermore, we conducted a user study to confirm PromptExp's accuracy and practical value, and demonstrate its potential to enhance LLM interpretability.

Paper Structure

This paper contains 35 sections, 5 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: Workflow of the aggregation-based approach. We show a running example with the prompt "info about US basketball team". In Stage 1, we calculate the token-level importance scores for each round of the generation. For instance, at the third round "(3) as", the importance scores for input tokens are "info (0.14)", "about (0.03)", "US (0.17)", "basketball (0.25)", "team (0.08)", "referred (0.15)", and "as (0.18)". In Stage 2, we aggregate the importance scores for each token in the prompt across different rounds. For instance, the importance score for "info" is calculated by aggregating the scores of "info" across all rounds highlighted in the red dashed box.
  • Figure 2: UI of PromptExp.
  • Figure 3: The visualization of a prompt "list the capital city and its location of the following countries: China, USA, Canada, and Japan." with our proposed approaches. Note that Perb$_{Sim}$ and Perb$_{Dic}$ are more likely to have zero importance scores for certain tokens since they measure the similarity of output. If removing tokens does not change the output, the importance of that token is 0.
  • Figure 4: Impact of $K$ on PromptExp using Perb$_{Log}$ (left), and impact of $M$ on PromptExp using Agg$_{Equ}$ and Agg$_{Conf}$ (right).
  • Figure 5: Word-level explanation for a prompt "What does AC stand for? Explain the answer like I am five" in Task 1 of the user study. The output is "Years old. Sure, I'd be happy to explain what AC stands for in a way that a five-year-old can understand! AC stands for "Air Conditioning." ... ".
  • ...and 4 more figures
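
The two token-level ideas described in the Figure 1 and Figure 3 captions can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: equal weighting across rounds in Stage 2, masking a token by deleting it, and importance computed as one minus output similarity. The function names, the Jaccard similarity, and `toy_generate` are hypothetical stand-ins for a real LLM and a real semantic-similarity measure. Only the round-3 scores come from the Figure 1 caption; the scores for rounds 1 and 2 are made-up illustrative values.

```python
from typing import Callable, List


def aggregate_prompt_scores(rounds: List[List[float]], n_prompt: int) -> List[float]:
    """Stage 2 sketch: average each prompt token's score over all rounds.

    Each round lists scores for the prompt tokens first, followed by the
    tokens generated so far. Equal weighting is an assumption here.
    """
    return [sum(r[i] for r in rounds) / len(rounds) for i in range(n_prompt)]


def perturbation_scores(
    tokens: List[str],
    generate: Callable[[List[str]], str],
    similarity: Callable[[str, str], float],
) -> List[float]:
    """Importance of a token = 1 - similarity(original output, output without it).

    A token whose removal leaves the output unchanged scores 0.
    """
    baseline = generate(tokens)
    return [
        1.0 - similarity(baseline, generate(tokens[:i] + tokens[i + 1:]))
        for i in range(len(tokens))
    ]


def jaccard(a: str, b: str) -> float:
    """Toy word-overlap similarity standing in for semantic similarity."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 1.0


def toy_generate(tokens: List[str]) -> str:
    """Hypothetical stand-in for an LLM call; only reacts to 'basketball'."""
    if "basketball" in tokens:
        return "the US basketball team roster"
    return "general facts about the US"


prompt = ["info", "about", "US", "basketball", "team"]

rounds = [
    [0.20, 0.05, 0.25, 0.35, 0.15],              # round 1: hypothetical values
    [0.15, 0.05, 0.20, 0.30, 0.10, 0.20],        # round 2: hypothetical values
    [0.14, 0.03, 0.17, 0.25, 0.08, 0.15, 0.18],  # round 3: from the Figure 1 caption
]

aggregated = aggregate_prompt_scores(rounds, len(prompt))
perturbed = perturbation_scores(prompt, toy_generate, jaccard)

for tok, agg, per in zip(prompt, aggregated, perturbed):
    print(f"{tok:<12} aggregated={agg:.3f} perturbation={per:.3f}")
```

In this toy setup, every token except "basketball" receives a perturbation score of 0 because deleting it does not change the stub's output, which mirrors the zero-importance behavior noted in the Figure 3 caption.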