Spectral Attention Steering for Prompt Highlighting

Weixian Waylon Li, Yuchen Niu, Yongxin Yang, Keshuang Li, Tiejun Ma, Shay B. Cohen

TL;DR

Spectral Editing Key Amplification (SEKA) is a training-free steering method that uses spectral decomposition to edit key embeddings directly, before attention computation, so it never needs explicit access to the full attention matrix.

Abstract

Attention steering is an important technique for controlling model focus, enabling capabilities such as prompt highlighting, where the model prioritises user-specified text. However, existing attention steering methods require explicit storage of the full attention matrix, making them incompatible with memory-efficient implementations like FlashAttention. We introduce Spectral Editing Key Amplification (SEKA), a training-free steering method that tackles this by directly editing key embeddings before attention computation. SEKA uses spectral decomposition to steer key embeddings towards latent directions that amplify attention scores for certain tokens. We extend this to Adaptive SEKA (AdaSEKA), a query-adaptive variant that uses a training-free routing mechanism to dynamically combine multiple expert subspaces based on the prompt's semantic intent. Our experiments show that both methods significantly outperform strong baselines on standard steering benchmarks while adding much lower latency and memory overhead and remaining compatible with optimised attention.
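
The key-editing idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the exact SEKA update rule is not given in this excerpt, so the sketch assumes that directions come from an SVD of a cross-covariance between "negative" and "positive" key embeddings, and that highlighted keys are amplified along the top singular directions by a fixed gain. The function names `top_directions` and `steer_keys`, the argument order of the cross-covariance, and the additive form of the update are all hypothetical.

```python
import numpy as np


def top_directions(h_pos, h_neg, rank):
    """Return the top `rank` left singular vectors and singular values.

    h_pos, h_neg: (num_tokens, d) key embeddings for the same token spans
    under positive and negative contexts. Using h_neg.T @ h_pos as the
    cross-covariance is an assumption made for this sketch.
    """
    omega = h_neg.T @ h_pos / h_pos.shape[0]  # (d, d) cross-covariance
    u, s, _ = np.linalg.svd(omega)            # columns of u are orthonormal
    return u[:, :rank], s[:rank]


def steer_keys(keys, u, gain, mask):
    """Amplify highlighted keys along span(u); leave other keys untouched.

    keys: (num_tokens, d) key embeddings, mask: (num_tokens,) boolean.
    The additive update k + gain * P k is an assumed form, with P = u u^T.
    """
    proj = u @ u.T                            # (d, d) orthogonal projector
    steered = keys.copy()
    steered[mask] = keys[mask] + gain * (keys[mask] @ proj)
    return steered
```

Because the edit touches only the key embeddings, the steered keys can be passed to any attention kernel unchanged, which is the property the abstract relies on for compatibility with FlashAttention. Under this assumed update, the component of a highlighted key inside the chosen subspace is scaled by `1 + gain`, while the orthogonal component is preserved.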

Paper Structure

This paper contains 53 sections, 13 equations, 10 figures, 14 tables, and 2 algorithms.

Figures (10)

  • Figure 1: Visualisation of pairwise key embedding shifts across different (layer, head) pairs in Qwen3-1.7B-Base via PCA. Positive vs. negative representations are plotted for 26 shared token spans. Grey arrows trace individual shifts; the dark blue arrow shows the average displacement.
  • Figure 2: An overview of SEKA and AdaSEKA. ${\bm{x}}$: context; ${\bm{h}}$: key embedding; ${\bm{\Omega}}$: cross-covariance; ${\bm{U}}$: left singular vectors; ${\bm{S}}$: singular values; $g$: gain coefficient. SEKA applies fixed gains, while AdaSEKA uses the query to compute dynamic steering weights.
  • Figure 3: Heatmaps of the average per-token $\ell_2$ distance between positive and negative key embeddings across all KV heads and layers for four Qwen3 model sizes. Higher values (green) indicate greater separation between positive and negative key representations.
  • Figure 4: Exact match scores on the lost-in-the-middle task for Qwen3 models of three different sizes, comparing the original model, PASTA/SEKA applied to the middle region (5 to 25 passages), and PASTA/SEKA applied to all passages.
  • Figure 5: Exact match scores when applying SEKA to the middle region with different values of the threshold $\delta_\text{min}$.
  • ...and 5 more figures