Interactive Prompt Debugging with Sequence Salience

Ian Tenney, Ryan Mullins, Bin Du, Shree Pandya, Minsuk Kahng, Lucas Dixon

TL;DR

Sequence Salience tackles the challenge of debugging long, complex prompts for language models by making input salience interpretable and actionable. It introduces an interactive, visual tool that aggregates token-level salience to words, sentences, or paragraphs and supports rapid iteration through prompt editing and re-generation. The approach relies on gradient-based salience (gradient L2 norm and gradient-dot-input) implemented on the Learning Interpretability Tool (LIT) platform, with backends for Gemma, Llama 2, Mistral, and GPT-2, so the same debugging workflow applies across few-shot, chain-of-thought, and constitution-style prompting. The paper demonstrates the tool's utility through case studies and provides open-source code to encourage adoption of interpretability methods for prompt design. The practical aim is to reduce cognitive load and speed up reliable prompt development for long, semi-structured prompts.
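
As a rough illustration of the two gradient-based scores named above, the sketch below computes them for a toy input. It assumes the per-token gradients of the target score with respect to the input embeddings are already available as plain lists; the function names `grad_l2` and `grad_dot_input` mirror the method labels in Figure 1 but are written here for illustration only and are not the tool's actual API.

```python
import math
from typing import List, Sequence


def grad_l2(grads: Sequence[Sequence[float]]) -> List[float]:
    """Unsigned score: L2 norm of each token's gradient vector."""
    return [math.sqrt(sum(g * g for g in row)) for row in grads]


def grad_dot_input(
    grads: Sequence[Sequence[float]],
    embeddings: Sequence[Sequence[float]],
) -> List[float]:
    """Signed score: dot product of each token's gradient and embedding."""
    if len(grads) != len(embeddings):
        raise ValueError("grads and embeddings must cover the same tokens")
    return [
        sum(g * e for g, e in zip(grad_row, emb_row))
        for grad_row, emb_row in zip(grads, embeddings)
    ]


# Toy values (hypothetical): 3 tokens, embedding dimension 2.
toy_grads = [[3.0, 4.0], [0.0, 1.0], [-1.0, 0.0]]
toy_embeddings = [[1.0, 0.0], [2.0, 2.0], [1.0, 5.0]]

l2_scores = grad_l2(toy_grads)
dot_scores = grad_dot_input(toy_grads, toy_embeddings)
print(l2_scores)   # [5.0, 1.0, 1.0]
print(dot_scores)  # [3.0, 2.0, -1.0]
```

The L2 variant is always non-negative, while the dot-product variant can be negative, which is one reason a tool might offer both.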

Abstract

We present Sequence Salience, a visual tool for interactive prompt debugging with input salience methods. Sequence Salience builds on widely used salience methods for text classification and single-token prediction, and extends this to a system tailored for debugging complex LLM prompts. Our system is well-suited for long texts, and expands on previous work by 1) providing controllable aggregation of token-level salience to the word, sentence, or paragraph level, making salience over long inputs tractable; and 2) supporting rapid iteration where practitioners can act on salience results, refine prompts, and run salience on the new output. We include case studies showing how Sequence Salience can help practitioners work with several complex prompting strategies, including few-shot, chain-of-thought, and constitutional principles. Sequence Salience is built on the Learning Interpretability Tool, an open-source platform for ML model visualizations, and code, notebooks, and tutorials are available at http://goo.gle/sequence-salience.
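
To make the aggregation idea concrete, here is a minimal sketch of rolling token-level scores up to coarser segments. The abstract does not say how scores are combined, so summing within each segment is an assumption, as are the helper names `aggregate_salience` and `normalize` and the toy tokenization.

```python
from typing import List, Sequence, Tuple


def aggregate_salience(
    token_scores: Sequence[float],
    segments: Sequence[Tuple[int, int]],
) -> List[float]:
    """Sum token scores inside each [start, end) segment (assumed rule)."""
    return [sum(token_scores[start:end]) for start, end in segments]


def normalize(scores: Sequence[float]) -> List[float]:
    """Scale scores by the largest magnitude so they fit a heatmap range."""
    peak = max((abs(s) for s in scores), default=0.0)
    return [s / peak if peak else 0.0 for s in scores]


# Hypothetical subword tokens and their token-level salience scores.
tokens = ["Ana", "lysis", ":", "good", "Rec", "ommend", "ation"]
token_scores = [0.5, 0.25, 0.25, 2.0, 0.125, 0.125, 0.25]

# Word-level segments as token index ranges: "Analysis:", "good", "Recommendation".
word_spans = [(0, 3), (3, 4), (4, 7)]

word_scores = aggregate_salience(token_scores, word_spans)
heatmap = normalize(word_scores)
print(word_scores)  # [1.0, 2.0, 0.5]
print(heatmap)      # [0.5, 1.0, 0.25]
```

The same function covers sentence or paragraph granularity by passing wider index ranges, which is what keeps a long prompt readable as a handful of segments rather than hundreds of tokens.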

Paper Structure (18 sections, 7 figures)

Figures (7)

  • Figure 1: Sequence Salience UI overview. The user can: (1) enter a prompt or edit an existing one, and optionally specify a target sequence to explain; (2) select a target sequence to explain, either a ground-truth sequence or a model generation; (3) control the selection granularity (tokens, words, sentences, lines, or paragraphs), visual display density, and a choice of salience methods (here, grad_l2 and grad_dot_input); and (4) select a segment, which triggers the system to compute salience with respect to that segment, showing the scores as a heatmap over preceding segments. Darker colors mean that segment is more influential or salient to the selected target. Shift-click can be used to select multiple segments, e.g., words comprising a phrase or clause.
  • Figure 2: Sequence Salience showing a sentence-level salience map for a user-selected sentence ("Recommendation..."). Here, the map suggests that an "Analysis..." sentence in the few-shot prompt is highly salient, but visual review shows it is followed by an incorrect recommendation. Some intervening text is hidden for space; for the full heatmap see Figure A.1.
  • Figure 3: Sequence Salience highlighting the influence of the constitutional principles the developer added to the beginning of the prompt, relative to the selected sentence ("Recommendation..."), helping to assess the effectiveness of prompt iterations. Some text is hidden for space; for the full heatmap see Figure A.1.
  • Figure 4: Side-by-side, sentence-level Sequence Salience maps comparing results for two variants of a GSM8K example. The left side shows the original example, with a diffuse salience map across the numerical values. The right side shows the prompt modified to remove the calculation annotations, which yields a more focused salience map over the operands and relevant answers and in turn reveals issues with specific arithmetic calculations.
  • Figure A.1: Full text heatmaps for examples from Figure 2 and Figure 3.
  • ...and 2 more figures