Prism-$\Delta$: Differential Subspace Steering for Prompt Highlighting in Large Language Models

Yuyao Ge, Shenghua Liu, Yiwei Wang, Tianyu Liu, Baolong Bi, Lingrui Mei, Jiayu Yao, Jiafeng Guo, Xueqi Cheng

TL;DR

This work proposes PRISM-$\Delta$ (Projection-based Relevance-Informed Steering Method), which decomposes the difference between positive and negative cross-covariance matrices to maximize discriminative energy while eliminating shared directions.

Abstract

Prompt highlighting steers a large language model to prioritize user-specified text spans during generation. A key challenge is extracting steering directions that capture the difference between relevant and irrelevant contexts, rather than shared structural patterns common to both. We propose PRISM-$\Delta$ (Projection-based Relevance-Informed Steering Method), which decomposes the difference between positive and negative cross-covariance matrices to maximize discriminative energy while eliminating shared directions. Each attention head receives a continuous softplus importance weight, letting weak-but-useful heads contribute at reduced strength. The framework extends naturally to Value representations, capturing content-channel signal that Key-only methods leave unused. Across four benchmarks and five models, PRISM-$\Delta$ matches or exceeds the best existing method on 19 of 20 configurations, with relative gains up to +10.6%, while halving the fluency cost of steering. PRISM-$\Delta$ also scales to long-context retrieval, outperforming the best existing method by up to +4.8% relative gain. PRISM-$\Delta$ is compatible with FlashAttention and adds negligible memory overhead.
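
The decomposition described in the abstract can be illustrated with a minimal NumPy sketch for a single attention head. Everything specific here is an assumption made for illustration, not the paper's specification: the cross-covariances are taken to be $\Omega^{\pm} = X^\top Y^{\pm} / n$ between a neutral sample matrix $X$ and positive/negative sample matrices $Y^{\pm}$, the projection is built from right singular vectors of $\Omega_\Delta = \Omega^+ - \Omega^-$, the rank is chosen by a hypothetical `energy` fraction, and the additive `steer` rule with its `gain` parameter is likewise hypothetical.

```python
import numpy as np

def differential_projection(neutral, positive, negative, energy=0.9):
    """Return (projection, singular_values, rank) for one head.

    Inputs are (n_samples, d) arrays of Key (or Value) vectors.
    The pairing and 1/n normalization are assumptions, not the paper's spec.
    """
    n = neutral.shape[0]
    omega_pos = neutral.T @ positive / n   # assumed form of Omega^+
    omega_neg = neutral.T @ negative / n   # assumed form of Omega^-
    omega_delta = omega_pos - omega_neg    # differential cross-covariance

    _, s, vt = np.linalg.svd(omega_delta)
    # Keep the fewest directions whose squared singular values reach `energy`.
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    rank = min(int(np.searchsorted(cumulative, energy)) + 1, len(s))
    basis = vt[:rank].T                    # (d, rank), orthonormal columns
    return basis @ basis.T, s, rank

def steer(vectors, projection, weight=1.0, gain=0.2):
    """Hypothetical steering rule: amplify each row's in-subspace component."""
    return vectors + gain * weight * (vectors @ projection)

# Synthetic data: `shared` appears in both classes, so it cancels in omega_delta.
rng = np.random.default_rng(0)
d, n = 16, 256
neutral = rng.normal(size=(n, d))
shared = rng.normal(size=(n, d))
positive = shared + 0.5 * neutral[:, ::-1]
negative = shared

projection, singular_values, rank = differential_projection(neutral, positive, negative)
steered = steer(positive, projection)
print("retained rank:", rank, "of", d)
```

Under the assumed definition the difference is linear in the samples, $\Omega_\Delta = X^\top (Y^+ - Y^-) / n$, so any component present identically in both classes drops out before the SVD is taken. That is the sense in which a differential decomposition removes shared directions in this sketch.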

Paper Structure (67 sections, 1 theorem, 8 equations, 9 figures, 17 tables, 1 algorithm)

This paper contains 67 sections, 1 theorem, 8 equations, 9 figures, 17 tables, 1 algorithm.

Key Result

Proposition 1

Let $\Omega_\Delta = U_\Delta \Sigma_\Delta V_\Delta^\top$ be the SVD of the differential cross-covariance. Then:

Figures (9)

  • Figure 1: Overview of Prism-$\Delta$. SVD decomposes $\Omega_\Delta$ into per-head projections ($P_K$, $P_V$) and importance weights ($w_{\ell,h}$), steering both Key and Value channels at inference.
  • Figure 2: Dual-channel discriminative signals in Qwen3-4B-Base (288 heads). (a) Each point is one attention head; Key and Value shifts are weakly correlated ($r{=}0.342$), confirming that the two channels carry complementary information. (b) Key signal peaks in middle layers (L13--24), while Value signal peaks in late layers (L25--36), suggesting functional specialization across depth.
  • Figure 3: Projection matrix structure for layer 21, head 4 of Qwen3-4B with $d{=}128$. Independent projections $P^+$ (rank 89) and $P^-$ (rank 39) exhibit overlapping subspaces ($\mathrm{tr}(P^+ P^-) = 1.31$), while the differential projection $P_\Delta$ (rank 89) directly targets the discriminative subspace.
  • Figure 4: Direction consistency analysis. $\Omega^+$ directions show high cross-head similarity (dominated by shared structural directions), while $\Omega_\Delta$ directions are nearly independent (close to random baseline), confirming that differential projection extracts head-specific discriminative directions.
  • Figure 5: Head weight heatmaps (36 layers $\times$ 8 heads). Left: Prism-$\Delta$ softplus weights show continuous gradation across 288 heads (range $[0.654, 0.808]$). Right: SEKA hard threshold ($\delta{=}0.12$) creates a binary partition, shutting off 108 heads entirely, including 90% of early-layer heads.
  • ...and 4 more figures
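
The contrast drawn in the Figure 5 caption, continuous softplus weights versus a hard threshold, can be sketched as follows. This is only an illustration under stated assumptions: the per-head `scores` are random placeholders rather than values from the paper, the softplus is applied directly to the raw score, and the hard rule is assumed to zero out every head whose score falls below $\delta$. How the paper actually derives the quantity passed to the softplus is not given in this summary.

```python
import numpy as np

def softplus_weights(scores):
    """Continuous weights log(1 + exp(score)); strictly positive for any score."""
    return np.logaddexp(0.0, scores)

def hard_threshold_weights(scores, delta=0.12):
    """Binary weights: 1.0 where score >= delta, else 0.0 (assumed gating rule)."""
    return (scores >= delta).astype(float)

# Placeholder head scores on a 36-layer x 8-head grid (made up for illustration).
rng = np.random.default_rng(0)
scores = rng.uniform(-0.1, 0.3, size=(36, 8))

soft = softplus_weights(scores)
hard = hard_threshold_weights(scores)

print("heads switched off by hard threshold:", int((hard == 0).sum()))
print("heads switched off by softplus:", int((soft == 0).sum()))
print("softplus weight range:", float(soft.min()), float(soft.max()))
```

Because the softplus is strictly positive and monotone, every head keeps a nonzero weight and the ordering of heads by score is preserved, whereas the thresholded variant discards all information about heads below the cutoff.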

Theorems & Definitions (2)

  • Proposition 1: Discriminative optimality of differential directions
  • proof